`

Load the tweets and check if they are loaded correctly. We also check the summary for a first interpretation. The summary(tweets) output reveals the following:

# Set working directory
# getwd()
# setwd("./data/")

# Load data
load("./Tweets_all.rda")

# Check that tweets are loaded
head(tweets)
## # A tibble: 6 × 14
##   created_at               id id_str            full_text in_reply_to_screen_n…¹
##   <dttm>                <dbl> <chr>             <chr>     <chr>                 
## 1 2023-01-20 17:17:32 1.62e18 1616469988369469… "Im MSc … <NA>                  
## 2 2023-01-13 07:52:01 1.61e18 1613790954737074… "Was bew… <NA>                  
## 3 2023-01-12 19:30:01 1.61e18 1613604227141537… "Was uns… <NA>                  
## 4 2023-01-12 08:23:00 1.61e18 1613436367169634… "Eine di… <NA>                  
## 5 2023-01-11 14:00:05 1.61e18 1613158809081450… "Wir gra… <NA>                  
## 6 2023-01-10 17:06:11 1.61e18 1612843252083834… "Unsere … <NA>                  
## # ℹ abbreviated name: ¹​in_reply_to_screen_name
## # ℹ 9 more variables: retweet_count <int>, favorite_count <int>, lang <chr>,
## #   university <chr>, tweet_date <dttm>, tweet_minute <dttm>,
## #   tweet_hour <dttm>, tweet_month <date>, timeofday_hour <chr>
summary(tweets)
##    created_at                          id               id_str         
##  Min.   :2009-09-29 14:29:47.0   Min.   :4.469e+09   Length:19575      
##  1st Qu.:2015-01-28 15:07:41.5   1st Qu.:5.604e+17   Class :character  
##  Median :2018-04-13 13:26:56.0   Median :9.848e+17   Mode  :character  
##  Mean   :2017-12-09 15:26:50.7   Mean   :9.400e+17                     
##  3rd Qu.:2020-10-20 10:34:50.0   3rd Qu.:1.318e+18                     
##  Max.   :2023-01-26 14:49:31.0   Max.   :1.619e+18                     
##   full_text         in_reply_to_screen_name retweet_count     favorite_count  
##  Length:19575       Length:19575            Min.   :  0.000   Min.   :  0.00  
##  Class :character   Class :character        1st Qu.:  0.000   1st Qu.:  0.00  
##  Mode  :character   Mode  :character        Median :  1.000   Median :  0.00  
##                                             Mean   :  1.289   Mean   :  1.37  
##                                             3rd Qu.:  2.000   3rd Qu.:  2.00  
##                                             Max.   :267.000   Max.   :188.00  
##      lang            university          tweet_date                    
##  Length:19575       Length:19575       Min.   :2009-09-29 00:00:00.00  
##  Class :character   Class :character   1st Qu.:2015-01-28 00:00:00.00  
##  Mode  :character   Mode  :character   Median :2018-04-13 00:00:00.00  
##                                        Mean   :2017-12-09 02:25:45.00  
##                                        3rd Qu.:2020-10-20 00:00:00.00  
##                                        Max.   :2023-01-26 00:00:00.00  
##   tweet_minute                      tweet_hour                    
##  Min.   :2009-09-29 14:29:00.00   Min.   :2009-09-29 14:00:00.00  
##  1st Qu.:2015-01-28 15:07:00.00   1st Qu.:2015-01-28 14:30:00.00  
##  Median :2018-04-13 13:26:00.00   Median :2018-04-13 13:00:00.00  
##  Mean   :2017-12-09 15:26:24.68   Mean   :2017-12-09 14:59:43.81  
##  3rd Qu.:2020-10-20 10:34:30.00   3rd Qu.:2020-10-20 10:00:00.00  
##  Max.   :2023-01-26 14:49:00.00   Max.   :2023-01-26 14:00:00.00  
##   tweet_month         timeofday_hour    
##  Min.   :2009-09-01   Length:19575      
##  1st Qu.:2015-01-01   Class :character  
##  Median :2018-04-01   Mode  :character  
##  Mean   :2017-11-24                     
##  3rd Qu.:2020-10-01                     
##  Max.   :2023-01-01

Start preprocessing the tweets, to calculate the intervalls some additional properties are needed. The preprocessing steps transform raw tweet data into a structured format suitable for analysis. This includes:

# Preprocessing Step: Convert date and time to POSIXct and format according to date, year and university. Detect language and extract emojis. The days are sorted from the system locale starting from monday
tweets <- tweets %>%
  mutate(
    created_at = as.POSIXct(created_at, format = "%Y-%m-%d %H:%M:%S"),
    date = as.Date(created_at),
    day = lubridate::wday(created_at,
      label = TRUE, abbr = FALSE,
      week_start = getOption("lubridate.week.start", 1),
      locale = Sys.getlocale("LC_TIME")
    ),
    year = year(created_at),
    month = floor_date(created_at, "month"),
    university = as.character(university),
    full_text_emojis = replace_emoji(full_text, emoji_dt = lexicon::hash_emojis)
  )

# Remove Emoji Tags helper funciton
# replace emoji places the emojis in the text as tags and their name, we remove them here
remove_emoji_tags <- function(text) {
  str_remove_all(text, "<[a-z0-9]{2}>")
}
# Remove Emoji Tags
tweets$full_text_emojis <- sapply(tweets$full_text_emojis, remove_emoji_tags)

# Store emojis in a sep arate column to analyze later
tweets$emoji_unicode <- tweets %>%
  emoji_extract_nest(full_text) %>%
  select(.emoji_unicode)

Question 1: How many tweets are being posted by the various Universities when? Are there any ‘release’ strategies visible?

Most Active Hours:

Each university has a distinct peak hour for tweeting, often aligning with typical working hours (9 AM - 5 PM). This suggests a strategic approach to reach their target audience when they are most likely online. The most active hours for each university are as follows:

  • FHNW: 9 AM
  • FH Graubünden: 11 AM
  • ZHAW: 5 PM
  • BFH: 8 AM
  • HES-SO: 10 AM
  • HSLU: 9 AM
  • OST-FH: 8 AM
  • SUPSI-CH: 11 AM

These times typically align with standard working hours, indicating a strategic approach to reach their audience during times they are most likely to be online. It appears that a typical worker is more productive and active on Twitter in the morning, with motivation waning around midday and continuing to decline until the end of the workday.

Most Active Days:

There isn’t a consistent “most active day” across universities. Some favor weekdays, while others show higher activity on weekends. This could reflect differences in their target audience or the nature of their content.

  • FHNW: Tuesday
  • FH Graubünden: Tuesday
  • ZHAW: Wednesday
  • BFH: Tuesday
  • HES-SO: Tuesday
  • HSLU: Thursday
  • OST-FH: Friday
  • SUPSI-CH: Friday

The pattern also suggests that tweet activity tends to be higher earlier in the week, with motivation and tweet frequency potentially falling as the week progresses.

# Count each tweet by university and hour of the day
tweet_counts_by_hour_of_day <- tweets %>%
  group_by(university, timeofday_hour) %>%
  count() %>%
  arrange(university, timeofday_hour)

# Plot the number of tweets by university and hour of the day
ggplot(
  tweet_counts_by_hour_of_day,
  aes(
    x = timeofday_hour, y = n,
    color = university, group = university
  )
) +
  geom_line() +
  facet_wrap(~university) +
  labs(
    title = "Number of tweets by university and hour",
    x = "Hour of day",
    y = "Number of tweets"
  )

# Show most active hours for each university
hours_with_most_tweets_by_uni <- tweet_counts_by_hour_of_day %>%
  group_by(university, timeofday_hour) %>%
  summarize(total_tweets = sum(n)) %>%
  group_by(university) %>%
  slice_max(n = 1, order_by = total_tweets)

print(hours_with_most_tweets_by_uni)
## # A tibble: 8 × 3
## # Groups:   university [8]
##   university     timeofday_hour total_tweets
##   <chr>          <chr>                 <int>
## 1 FHNW           09                      344
## 2 FH_Graubuenden 11                      493
## 3 ZHAW           17                      580
## 4 bfh            08                      497
## 5 hes_so         10                      315
## 6 hslu           09                      380
## 7 ost_fh         08                       44
## 8 supsi_ch       11                      330
# Show most active hour overall
hour_with_most_tweets <- tweet_counts_by_hour_of_day %>%
  group_by(timeofday_hour) %>%
  summarize(total_tweets = sum(n)) %>%
  arrange(desc(total_tweets)) %>%
  slice_max(n = 1, order_by = total_tweets)

print(hour_with_most_tweets)
## # A tibble: 1 × 2
##   timeofday_hour total_tweets
##   <chr>                 <int>
## 1 11                     2356
# Count each tweet by university and weekday
tweet_counts_by_week_day <- tweets %>%
  group_by(university, day) %>%
  count() %>%
  arrange(university, day)

# Plot the number of tweets by university and day of the week
ggplot(
  tweet_counts_by_week_day,
  aes(
    x = day, y = n,
    color = university,
    group = university
  )
) +
  geom_line() +
  facet_wrap(~university) +
  labs(
    title = "Number of tweets by university and day of the week",
    x = "Day of the week",
    y = "Number of tweets"
  )

# Show most active days for each university
days_with_most_tweets_by_uni <- tweet_counts_by_week_day %>%
  group_by(university, day) %>%
  summarize(total_tweets = sum(n)) %>%
  group_by(university) %>%
  slice_max(n = 1, order_by = total_tweets)

print(days_with_most_tweets_by_uni)
## # A tibble: 8 × 3
## # Groups:   university [8]
##   university     day       total_tweets
##   <chr>          <ord>            <int>
## 1 FHNW           Tuesday            575
## 2 FH_Graubuenden Tuesday            751
## 3 ZHAW           Wednesday          636
## 4 bfh            Tuesday            651
## 5 hes_so         Tuesday            415
## 6 hslu           Thursday           603
## 7 ost_fh         Friday              65
## 8 supsi_ch       Friday             461
# Combine the most active hours and days for each university to show heatmap
heatmap_data <- tweets %>%
  group_by(timeofday_hour, day) %>%
  count() %>%
  ungroup()

# Plot heatmap and we can see clearly that the most tweets are posted during the working hours from monday to friday
ggplot(heatmap_data, aes(x = day, y = timeofday_hour, fill = n)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  labs(
    title = "Heatmap of Tweet Activity by Hour and Day",
    x = "Day",
    y = "Hour",
    fill = "Number of Tweets"
  ) +
  theme_minimal()

Release Strategies:

While universities have peak hours and days, the intervals between tweets vary significantly, indicating a more reactive strategy rather than a rigid release schedule. This variability suggests that the universities might be responding to real-time events or trends rather than sticking to a strict posting schedule.

Dispersion Measures and Analysis

To further understand the dispersion of tweets, we analyzed the time intervals between tweets using measures like the mean interval, standard deviation, and entropy. Understanding the variability helps in assessing how consistent or sporadic the posting behavior is. High variability indicates that the university does not follow a strict schedule and posts at irregular intervals, which could be a sign of a more reactive approach to social media. Higher entropy suggests less predictability in tweet timing, indicating a more dynamic and responsive posting strategy. This is crucial for understanding how universities might be reacting to real-time events or trends rather than following a predetermined schedule.Here are the results for selected universities:

  • FHNW: Significant variability in mean intervals (1291 to 6994 minutes) and stable entropy with higher values in some years, indicating fluctuating engagement and reactive posting at times.
  • FH Graubünden: High variability in mean intervals, notably increasing in 2023 (13889 minutes). Entropy peaks in 2021 (2.78), suggesting a highly reactive posting strategy.
  • ZHAW: Fluctuating mean intervals, with a significant increase in 2023 (35371 minutes). High entropy in 2020 (2.38), indicating dynamic and reactive posting periods.
  • BFH: Generally stable mean intervals with minor fluctuations. Entropy shows varying predictability, indicating a mix of stable and responsive tweeting.
  • HES-SO: High variability in mean intervals, especially in earlier years. Generally stable entropy with some fluctuations, indicating a mix of consistent and reactive posting.
  • HSLU: Fluctuating mean intervals with a decrease in 2021. Entropy indicates varying unpredictability, suggesting a mix of consistent and reactive posting.
  • OST-FH: Higher mean intervals in 2021 and 2022, indicating less frequent posting. High variability in entropy, particularly in 2022, suggesting flexible posting behavior.
  • SUPSI-CH: Significant fluctuations in mean intervals over the years. High entropy in 2020, indicating a dynamic and reactive posting strategy.
tweets <- tweets %>%
  arrange(university, created_at) %>%
  group_by(university) %>%
  mutate(time_interval = as.numeric(
    difftime(created_at, lag(created_at), units = "mins")
  ))

# Plotting the time intervals
universities <- unique(tweets$university)
for (uni in universities) {
  uni_filtered_data <- tweets %>%
    filter(university == uni)

  # Plot the distribution of time intervals
  print(ggplot(uni_filtered_data, aes(x = time_interval)) +
    geom_histogram(fill = "lightblue", bins = 30) +
    facet_wrap(~year) +
    labs(
      title = paste0("Distribution of time intervals between tweets - ", uni),
      x = "Time interval (minutes)",
      y = "Tweet count"
    ))

  # Plot posting day for each year for university because a intervall could be "short" but when a university only posts twice a year it seems active but it is actually not
  tweet_counts <- uni_filtered_data %>%
    group_by(tweet_month) %>%
    summarise(tweet_count = n())
  print("Tweet count for each month")
  print(ggplot(tweet_counts, aes(x = tweet_month, y = tweet_count)) +
    geom_line(color = "#F9C301") +
    geom_point(color = "#37556E", size = 3) +
    scale_x_date(date_labels = "%b %Y", date_breaks = "1 month") +
    labs(
      title = paste0("Monthly Tweet Activity - ", uni),
      x = "Month",
      y = "Tweet Count"
    ) +
    theme(axis.text.x = element_text(angle = 90, hjust = 1)))

  # Calculate dispersion measures for time intervals
  dispersion_measures <- uni_filtered_data %>%
    group_by(year) %>%
    summarise(
      mean_interval = mean(time_interval, na.rm = TRUE),
      sd_interval = sd(time_interval, na.rm = TRUE),
      entropy_interval = entropy::entropy(discretize(time_interval,
        numBins = 30, r = range(time_interval, na.rm = TRUE)
      ))
    )

  print(paste("Dispersion measures for", uni))
  print(dispersion_measures)

  # Line plot for mean interval and standard deviation over the years
  print(ggplot(dispersion_measures, aes(x = year)) +
    geom_line(aes(y = mean_interval, color = "Mean Interval")) +
    geom_point(aes(y = mean_interval, color = "Mean Interval")) +
    geom_line(aes(y = sd_interval, color = "Standard Deviation")) +
    geom_point(aes(y = sd_interval, color = "Standard Deviation")) +
    scale_y_continuous(sec.axis = dup_axis()) +
    labs(
      title = paste0("Mean Interval and Standard Deviation of Difftime Over Years - ", uni),
      x = "Year",
      y = "Minutes",
      color = "Measure"
    ) +
    theme_minimal())

  print(ggplot(dispersion_measures, aes(x = year)) +
    geom_line(aes(y = entropy_interval, color = "Entropy Interval")) +
    geom_point(aes(y = entropy_interval, color = "Entropy Interval")) +
    scale_y_continuous(sec.axis = dup_axis()) +
    labs(
      title = paste0("Entropy of Tweet Intervals Over Years - ", uni),
      x = "Year",
      y = "Entropy"
    ) +
    theme_minimal())
}

## [1] "Tweet count for each month"

## [1] "Dispersion measures for FHNW"
## # A tibble: 13 × 4
##     year mean_interval sd_interval entropy_interval
##    <dbl>         <dbl>       <dbl>            <dbl>
##  1  2011         6994.       4215.            0.693
##  2  2012         2120.       2893.            2.01 
##  3  2013         1291.       2668.            1.29 
##  4  2014         3238.       4378.            2.17 
##  5  2015         4695.       7894.            1.81 
##  6  2016         5382.       7382.            2.21 
##  7  2017         3696.       6566.            1.61 
##  8  2018         1514.       3440.            0.748
##  9  2019         1657.       2226.            1.53 
## 10  2020         1744.       1670.            2.36 
## 11  2021         2022.       2269.            2.36 
## 12  2022         1830.       1968.            2.35 
## 13  2023         2160.       2580.            2.15

## [1] "Tweet count for each month"

## [1] "Dispersion measures for FH_Graubuenden"
## # A tibble: 15 × 4
##     year mean_interval sd_interval entropy_interval
##    <dbl>         <dbl>       <dbl>            <dbl>
##  1  2009         2313.       3692.            2.00 
##  2  2010        33181.      63719.            1.39 
##  3  2011         6164.       8193.            1.97 
##  4  2012         2501.       4219.            1.64 
##  5  2013         3197.       5529.            1.76 
##  6  2014         1254.       3569.            0.578
##  7  2015         1088.       1521.            1.98 
##  8  2016         1448.       1726.            1.89 
##  9  2017         2227.       2462.            2.14 
## 10  2018         2384.       3034.            1.88 
## 11  2019         3019.       3146.            2.38 
## 12  2020         3011.       2764.            2.51 
## 13  2021         2913.       2723.            2.78 
## 14  2022         3558.       4138.            2.45 
## 15  2023        13889.       7145.            1.10

## [1] "Tweet count for each month"

## [1] "Dispersion measures for ZHAW"
## # A tibble: 12 × 4
##     year mean_interval sd_interval entropy_interval
##    <dbl>         <dbl>       <dbl>            <dbl>
##  1  2012         1178.       1486.            2.09 
##  2  2013         1291.       1963.            1.35 
##  3  2014         1098.       1884.            1.29 
##  4  2015         2032.       2341.            2.00 
##  5  2016         1570.       1847.            2.00 
##  6  2017         1524.       1580.            2.21 
##  7  2018         1490.       1552.            2.42 
##  8  2019         1448.       1570.            2.25 
##  9  2020         2527.       2917.            2.38 
## 10  2021         3160.       2988.            2.43 
## 11  2022         5463.       6825.            2.06 
## 12  2023        35371.      50020.            0.693

## [1] "Tweet count for each month"

## [1] "Dispersion measures for bfh"
## # A tibble: 12 × 4
##     year mean_interval sd_interval entropy_interval
##    <dbl>         <dbl>       <dbl>            <dbl>
##  1  2012         1483.       2261.             1.68
##  2  2013         1265.       1888.             1.57
##  3  2014         1055.       1526.             1.78
##  4  2015         2209.       2797.             2.03
##  5  2016         3301.       4155.             2.19
##  6  2017         3149.       3790.             2.35
##  7  2018         2510.       2948.             2.10
##  8  2019         2059.       2686.             1.74
##  9  2020         1968.       2164.             1.97
## 10  2021         1484.       1909.             1.52
## 11  2022         1856.       2106.             2.31
## 12  2023         2800.       3313.             2.10

## [1] "Tweet count for each month"

## [1] "Dispersion measures for hes_so"
## # A tibble: 14 × 4
##     year mean_interval sd_interval entropy_interval
##    <dbl>         <dbl>       <dbl>            <dbl>
##  1  2010         9353.      10637.            2.54 
##  2  2011         4852.       6117.            2.09 
##  3  2012         4983.       6238.            2.38 
##  4  2013         7294.      17630.            1.22 
##  5  2014         6485.       8252.            2.27 
##  6  2015         4772.      10362.            1.34 
##  7  2016         5423.       9275.            1.78 
##  8  2017         8087.      14932.            1.79 
##  9  2018         3107.       6086.            1.65 
## 10  2019         2622.       7755.            0.699
## 11  2020         1521.       2343.            1.66 
## 12  2021         2956.       4038.            1.97 
## 13  2022         2427.       2218.            2.74 
## 14  2023         1860.       1787.            2.11

## [1] "Tweet count for each month"

## [1] "Dispersion measures for hslu"
## # A tibble: 8 × 4
##    year mean_interval sd_interval entropy_interval
##   <dbl>         <dbl>       <dbl>            <dbl>
## 1  2016         2192.       3286.             1.79
## 2  2017         1702.       2338.             1.91
## 3  2018         1261.       1765.             1.76
## 4  2019         1960.       2304.             2.20
## 5  2020         1095.       1423.             2.08
## 6  2021          569.        598.             2.41
## 7  2022          844.        905.             2.16
## 8  2023         1909.       1524.             2.58

## [1] "Tweet count for each month"

## [1] "Dispersion measures for ost_fh"
## # A tibble: 3 × 4
##    year mean_interval sd_interval entropy_interval
##   <dbl>         <dbl>       <dbl>            <dbl>
## 1  2020         1478.       1653.             2.20
## 2  2021         5272.       7736.             2.06
## 3  2022         8470.      15109.             1.77

## [1] "Tweet count for each month"

## [1] "Dispersion measures for supsi_ch"
## # A tibble: 11 × 4
##     year mean_interval sd_interval entropy_interval
##    <dbl>         <dbl>       <dbl>            <dbl>
##  1  2013         2536.       3892.            1.85 
##  2  2014         5283.      12951.            1.15 
##  3  2015         8346.      12979.            1.99 
##  4  2016         2634.       5705.            1.48 
##  5  2017         1396.       3394.            0.772
##  6  2018         1648.       3127.            1.45 
##  7  2019         2302.       3156.            1.81 
##  8  2020         3093.       3234.            2.34 
##  9  2021         1769.       2297.            1.49 
## 10  2022         2677.       3112.            1.88 
## 11  2023         2372.       2840.            1.96

# Descriptive statistics of time intervals
summary(tweets$time_interval)
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's 
##      0.0    148.2   1128.8   2097.6   2428.3 220707.0        8

Conclusion:

The data indicates that Swiss Universities of Applied Sciences primarily tweet during working hours and show distinct patterns in their most active days and hours. Workers tend to be more productive and active on Twitter in the morning, with a noticeable decline in activity around midday and towards the end of the week. The dispersion measures support this interpretation, showing varying levels of unpredictability across different years and universities.

Question 2: What are the tweets about and how do other Twitter users react to them (likes, etc.)?

Data Preprocessing

The tweets are filtered based on language, focusing on German, French, Italian, and English. These languages where choosen based on the popularity over all tweet languages. It removes common and extended stopwords, including non-meaningful words like ‘amp’ (which represents ‘&’) and ‘rt’ (commonly found in retweets). The extended stopwords list includes hashtags and URLs related to specific Swiss universities.

Next, the code processes tweets separately for each language. This involves creating tokens from the text, removing unwanted characters, stemming words, and creating n-grams. The processed tokens are then used to create Document-Feature Matrices (DFMs) for each language.

langs <- c("de", "fr", "it", "en")
tweets_filtered <- tweets %>%
  filter(lang %in% langs)

# Define extended stopwords (outside loop for efficiency)
# Remove 'amp' as it is not meaningful because its only & symbol
# Remove 'rt' because it is an word e.g 'engagiert'.
extended_stopwords <- c(
  "#fhnw", "#bfh", "@htw_chur", "#hslu", "#supsi", "#sups",
  "amp", "rt", "fr", "ber", "t.co", "https", "http", "www", "com", "html"
)
# Create separate DFMs for each language
dfm_list <- list()
for (sel_lang in langs) {
  # Subset tweets for the current language
  tweets_lang <- tweets_filtered %>%
    filter(lang == sel_lang)
  # Create tokens for the current language
  stopwords_lang <- stopwords(sel_lang)
  # Create tokens for all tweets:
  # - create corpus and tokens because tokensonly works on character, corpus, list, tokens, tokens_xptr objects.
  # - create tokens and remove: URLS, Punctuation, Numbers, Symbols, Separators
  # - transform to lowercase
  # - Stem all words
  # - Create n-grams of any length (not includinf bigrams and trigrams but they are shown later)
  # - It is important to remove the stopwords after stemming the words because we remove the endings from some stem words
  tokens_lang <- tweets_lang %>%
    corpus(text_field = "full_text_emojis") %>%
    tokens(
      remove_punct = TRUE, remove_symbols = TRUE, remove_numbers = TRUE,
      remove_url = TRUE, remove_separators = TRUE
    ) %>%
    tokens_tolower() %>%
    tokens_wordstem(lang = sel_lang) %>%
    tokens_ngrams(n = 1) %>%
    tokens_select(
      pattern =
        c(stopwords_lang, extended_stopwords), selection = "remove"
    )
  # Create DFM for the current language
  dfm_list[[sel_lang]] <- dfm(tokens_lang)
}

Content Analysis

Tweets were analyzed across four languages: German, French, Italian, and English. Each university tends to tweet predominantly in one or more languages, reflecting the linguistic diversity of Switzerland.

  • German: Predominantly used by BFH and FHNW. Common words include “neu” (new), “mehr” (more), “schweiz” (Switzerland), and “studier” (study).
  • French: Primarily used by HES-SO. Common words include “projet” (project), “recherch” (research), and “tudi” (study).
  • Italian: Mostly used by SUPSI. Common words include “nuov” (new), “progett” (project), and “student” (student).
  • English: Frequently used by HSLU. Common words include “student”, “project”, “thank”, and “university”. ### Word Frequency:
  • English: The most frequent words were “student”, “new”, “@hslu”, “university”, “project”, “thank”, “@zhaw”, “day”, “science”, and “today”.
  • German: The most frequent words were “neu”, “mehr”, “schweiz”, “werd”, “all”, “studier”, “heut”, “hochschul”, “bfh”, and “jahr”.
  • Italian: The most frequent words were “nuov”, “sups”, “progett”, “student”, “present”, “info”, “iscrizion”, “cors”, “ricerc”, and “formazion”.
  • French: The most frequent words were “hes-so”, “right”, “arrow”, “dan”, “projet”, “a”, “tudi”, “haut”, “col”, and “@hes_so”.

It’s important to note that some words like “right” 👉 and “arrow” ➡️ are actually names of parsed emojis and not written words in the tweets.

Word clouds for each language visually show the most common words, emphasizing their relative frequencies. The analysis revealed that universities tweet in multiple languages, reflecting the linguistic diversity of their audience. But still we can order the universities by language. For example BFH and FHNW are tweeting in german, HES-SO in french, SUPSI in italian and HSLU in english.

# Word Frequencies & Visualization
words_freqs_en <- sort(colSums(dfm_list$en), decreasing = TRUE)
head(words_freqs_en, 20)
##     student       @hslu         new       @zhaw     project     univers 
##         117          90          86          73          69          69 
##         day       thank switzerland       swiss      scienc       innov 
##          68          67          67          65          64          61 
##         now       studi       great       today        join       @fhnw 
##          54          54          53          52          48          47 
##    research   @supsi_ch 
##          47          47
wordcloud(
  words = names(words_freqs_en),
  freq = words_freqs_en,
  max.words = 200,
  random.order = FALSE,
  colors = brewer.pal(8, "Dark2")
)

words_freqs_de <- sort(colSums(dfm_list$de), decreasing = TRUE)
head(words_freqs_de, 20)
##       neu      mehr   schweiz      werd       all   studier      heut hochschul 
##      1680      1133       988       781       731       729       696       621 
##     statt       bfh      jahr      bern   digital     thema    findet     knnen 
##       607       598       539       533       525       524       520       517 
##   projekt     studi     welch    arbeit 
##       488       484       464       444
wordcloud(
  words = names(words_freqs_de),
  freq = words_freqs_de,
  max.words = 200,
  random.order = FALSE,
  colors = brewer.pal(8, "Dark2")
)

words_freqs_it <- sort(colSums(dfm_list$it), decreasing = TRUE)
head(words_freqs_it, 20)
##        nuov        sups     progett  #supsinews        info     student 
##         214         208         176         151         147         146 
##     present   iscrizion        cors #supsievent      ricerc   formazion 
##         145         144         142         139         137         135 
##       scopr      inform      diplom    bachelor       apert        tutt 
##         129         120         119         113         111         109 
##      master      dipart 
##         107         105
wordcloud(
  words = names(words_freqs_it),
  freq = words_freqs_it,
  max.words = 200,
  random.order = FALSE,
  colors = brewer.pal(8, "Dark2")
)

# It seems that there are some english words but I think this are emojis
words_freqs_fr <- sort(colSums(dfm_list$fr), decreasing = TRUE)
head(words_freqs_fr, 20)
##    hes-so     right     arrow    projet       dan         a      tudi      haut 
##       515       433       324       251       249       234       200       182 
##       col   @hes_so    dcouvr @hessoval      book   #hes_so  recherch      open 
##       155       150       130       129       123       119       117       117 
##      mast     suiss      plus   nouveau 
##       111       110       105        99
wordcloud(
  words = names(words_freqs_fr),
  freq = words_freqs_fr,
  max.words = 200,
  random.order = FALSE,
  colors = brewer.pal(8, "Dark2")
)

# University-specific Analysis
for (uni in unique(tweets$university)) {
  # Subset tweets for the current language
  uni_tweets <- tweets %>%
    filter(university == uni)

  tokens_lang <- uni_tweets %>%
    corpus(text_field = "full_text_emojis") %>%
    tokens(
      remove_punct = TRUE, remove_symbols = TRUE, remove_numbers = TRUE,
      remove_url = TRUE, remove_separators = TRUE
    ) %>%
    tokens_tolower() %>%
    tokens_wordstem() %>%
    tokens_ngrams(n = 1) %>%
    tokens_select(
      pattern =
        c(
          stopwords("en"), stopwords("de"),
          stopwords("fr"), stopwords("it"), extended_stopwords
        ), selection = "remove"
    )
  # Create Data Frame Matrix for uni with all languages
  uni_dfm <- dfm(tokens_lang)
  # Word Frequencies
  uni_word_freqs <- sort(colSums(uni_dfm), decreasing = TRUE)
  # print most common words: the emoji right are used often
  print(paste("Most common words for", uni, ":"))
  print(head(uni_word_freqs, 20))
  wordcloud(
    words = names(uni_word_freqs),
    freq = uni_word_freqs,
    max.words = 200,
    random.order = FALSE,
    colors = brewer.pal(8, "Dark2")
  )
}
## [1] "Most common words for FHNW :"
##       @fhnwbusi            fhnw        @hsafhnw            mehr       hochschul 
##             380             320             261             233             214 
##    @fhnwtechnik           @fhnw            neue          campus            heut 
##             207             205             160             149             149 
## @fhnwpsychologi      studierend           neuen         projekt         schweiz 
##             138             114             108             101             101 
##           knnen           basel      entwickelt  brugg-windisch            prof 
##              97              95              85              83              82

## [1] "Most common words for FH_Graubuenden :"
##            chur        #htwchur           statt             htw          findet 
##             367             360             307             292             276 
##            mehr     blogbeitrag           #fhgr     #infoanlass       graubnden 
##             230             203             182             179             177 
##      infoanlass           #chur           neuen            heut        #studium 
##             157             128             128             127             123 
##           manag @suedostschweiz            neue              fh       tourismus 
##             111             111             103              98              96

## [1] "Most common words for ZHAW :"
##             zhaw            @zhaw @engineeringzhaw            #zhaw 
##              281              257              245              187 
##               cc             neue             dank  @iam_winterthur 
##              185              152              147              146 
##             heut             mehr            knnen            zeigt 
##              141              138              137              133 
##       winterthur        schweizer              via        @sml_zhaw 
##              126              125              124              120 
##      #zhawimpact          schweiz            studi             gibt 
##              120              115              115              105

## [1] "Most common words for bfh :"
##             bfh            bern          berner            mehr       @bfh_hesb 
##             606             282             231             213             208 
##            neue           thema   fachhochschul        @hkb_bfh           knnen 
##             205             199             166             117             109 
##         projekt #knoten_maschen            biel            heut           innen 
##             109             107             102             102             102 
##        anmelden       schweizer           neuen         schweiz           statt 
##             101             100              96              92              89

## [1] "Most common words for hes_so :"
##      hes-so       right       arrow      projet         dan     tudiant 
##         532         445         331         249         248         192 
##        haut     @hes_so        cole @hessovalai        book      master 
##         178         172         149         133         124         123 
##     #hes_so        open    recherch       suiss        plus     magnifi 
##         123         123         116         109         103          97 
##       glass        tilt 
##          97          97

## [1] "Most common words for hslu :"
##           @hslu          luzern            mehr       hochschul          depart 
##             368             334             258             198             175 
## #hsluinformatik            heut            neue       schweizer           zeigt 
##             173             171             165             146             133 
##          design           knnen           studi         schweiz            gibt 
##             132             127             125             118             114 
##            jahr              ab           neuen         projekt          arbeit 
##             107             105             105             103              99

## [1] "Most common words for ost_fh :"
## #ostschweizerfachhochschul                    @ost_fh 
##                         73                         64 
##                        ost                   @ozg_ost 
##                         55                         28 
##                       mehr                       neue 
##                         26                         22 
##                  st.gallen                 rapperswil 
##                         17                         17 
##                      neuen                 ostschweiz 
##                         17                         15 
##                #informatik                    podcast 
##                         15                         15 
##                     detail                       gibt 
##                         15                         14 
##                       #ost                   menschen 
##                         12                         12 
##                      thema                     campus 
##                         12                         12 
##                 @eastdigit                   #podcast 
##                         12                         12

## [1] "Most common words for supsi_ch :"
##        supsi     #supsiev    #supsinew         info     studenti    formazion 
##          231          183          168          148          133          132 
##     progetto    @supsi_ch   iscrizioni       master        nuovo     bachelor 
##          126          126          117          117          116          115 
##        right        innov dipartimento           pi         oggi @usi_univers 
##          114          109          104          103          102          102 
## informazioni        manag 
##           97           95

Userreaction Analysis

To understand user reactions, the code calculates a ‘weighted engagement’ metric, combining favorite and retweet counts. The tweets with the highest engagement are analyzed by hour and day to identify patterns in user interaction.

Emgagement Time of Most Engaged Tweets:

The provided bar plots show the average engagement of tweets by hour of the day and by day of the week. Each bar represents the average engagement score for tweets posted during specific hours or on specific days.

Peak Engagement:

  • The highest average engagement occurs around 20:00 (8 PM), with a significant spike.
  • Other notable peaks are at 16:00 (4 PM) and 19:00 (7 PM), showing elevated engagement during these hours.
  • Tuesday shows the highest average engagement, indicating that tweets posted on this day receive the most interaction from users.
  • Friday and Sunday also have relatively high engagement, suggesting that these days are good for posting tweets.

Consistent Engagement:

  • Engagement remains relatively consistent throughout the day from 06:00 (6 AM) to 17:00 (5 PM), with moderate fluctuations.
  • Monday and Thursday show consistent engagement scores that are moderate compared to the peak days. Engagement on these days is lower than Tuesday but higher than Wednesday and Saturday.

Lower Engagement:

  • There is a noticeable dip in engagement early in the morning around 00:00 (midnight) and 23:00 (11 PM).
  • Engagement is also relatively lower around 22:00 (10 PM).
  • Wednesday and Saturday have the lowest average engagement scores, indicating that tweets posted on these days receive less interaction compared to other days.
# Calculate a 'weighted engagement' metric
tweets <- tweets %>%
  mutate(
    weighted_engagement = favorite_count * 1 + retweet_count * 2
  )

# Identify tweets with the highest weighted engagement
most_engaged_tweets <- tweets %>%
  arrange(desc(weighted_engagement)) %>%
  head(1000) # Top 1000 for analysis

# Calculate average engagement by hour
engagement_hour <- most_engaged_tweets %>%
  group_by(timeofday_hour) %>%
  summarise(avg_engagement = mean(weighted_engagement, na.rm = TRUE))

ggplot(engagement_hour, aes(x = timeofday_hour, y = avg_engagement)) +
  geom_bar(stat = "identity", fill = "blue") +
  labs(
    title = "Average Engagement by Hour",
    x = "Hour of Day",
    y = "Average Engagement"
  ) +
  theme_minimal()

# Calculate average engagement by day
engagement_day <- most_engaged_tweets %>%
  group_by(day) %>%
  summarise(avg_engagement = mean(weighted_engagement, na.rm = TRUE))

# Plot average engagement by day
ggplot(engagement_day, aes(x = day, y = avg_engagement)) +
  geom_bar(stat = "identity", fill = "blue") +
  labs(
    title = "Average Engagement by Day of the Week",
    x = "Day of the Week",
    y = "Average Engagement"
  ) +
  theme_minimal()

Analyse the content of the most liked tweets

The most common words in the most liked tweets include “mehr” (more), “neue” (new), “schweiz” (Switzerland), “heut” (today), and “hochschul” (university). These words suggest that tweets focusing on new developments, events happening today, and general updates about Switzerland and universities tend to receive more likes.

# Preprocessing content of most liked tweets
tokens_most_engaged <- most_engaged_tweets %>%
  corpus(text_field = "full_text_emojis") %>%
  tokens(
    remove_punct = TRUE, remove_symbols = TRUE, remove_numbers = TRUE,
    remove_url = TRUE, remove_separators = TRUE
  ) %>%
  tokens_tolower() %>%
  tokens_wordstem(lang = sel_lang) %>%
  tokens_ngrams(n = 1) %>%
  tokens_select(
    pattern =
      c(
        stopwords("en"), stopwords("de"),
        stopwords("fr"), stopwords("it"), extended_stopwords
      ), selection = "remove"
  )
tokens_most_engaged_dfm <- dfm(tokens_most_engaged)
freqs_most_engaged <- sort(colSums(tokens_most_engaged_dfm), decreasing = TRUE)
# print most common words: the emoji right are used often
head(freqs_most_engaged, 20)
##            mehr            neue         schweiz       schweizer           right 
##              81              67              48              47              46 
##            heut           zeigt #hsluinformatik           studi            zhaw 
##              44              41              40              39              39 
##          hes-so           knnen           neuen       hochschul          campus 
##              38              38              36              34              33 
##           innov            gibt              ab      entwickelt             bfh 
##              31              30              30              30              30
set.seed(123)
wordcloud(
  words = names(freqs_most_engaged),
  freq = freqs_most_engaged,
  max.words = 200,
  random.order = FALSE,
  colors = brewer.pal(8, "Dark2")
)

Question 3: How do the university tweets differ in terms of content, style, emotions, etc?

Content Analysis (Word Clouds)

Each university shows distinct patterns in the words and emojis used in their tweets. The analysis involved creating word clouds and identifying the most common words and emojis.

Most Common Words:

  • FHNW: Common words include “mehr” (more), “hochschul” (university), and “studierend” (students).
  • FH Graubünden: Words like “chur” (location), “htw” (university abbreviation), and “busi” (business) are frequent.
  • ZHAW: Frequent words include “zhaw” (university abbreviation), “engineering”, and “schweizer” (Swiss).
  • BFH: Common terms are “bern” (location), “projekt” (project), and “hochschul” (university).
  • HES-SO: Words like “hes-so” (university abbreviation), “projet” (project), and “tudiant” (student) are prevalent.
  • HSLU: Common words are “hslu” (university abbreviation), “luzern” (location), and “hochschul” (university).
  • OST-FH: Frequent terms include “ost” (university abbreviation), “st.gallen” (location), and “podcast”.
  • SUPSI-CH: Words like “supsi” (university abbreviation), “formazion” (education), and “progetto” (project) are prevalent.

Most Common Emojis: - FHNW: Top emojis include 👉 (backhand index pointing right), 💛 (yellow heart), and 🖤 (black heart). - FH Graubünden: Frequent emojis are 🎉 (party popper), 😃 (grinning face with big eyes), and 😊 (blush). - ZHAW: Common emojis include 👉 (backhand index pointing right), ⚡ (high voltage), and 😉 (wink). - BFH: Top emojis are 👉 (backhand index pointing right), 🔋 (battery), and 👇 (backhand index pointing down). - HES-SO: Common emojis are 👉 (backhand index pointing right), 🎓 (graduation cap), and ➡ (arrow right). - HSLU: Top emojis include 🎓 (graduation cap), 👨 (man), and 🚀 (rocket). - OST-FH: Frequent emojis are 👉 (backhand index pointing right), ➡ (arrow right), and 🎓 (graduation cap). - SUPSI-CH: Common emojis include 👉 (backhand index pointing right), 🎓 (graduation cap), and 🎉 (party popper).

for (uni in unique(tweets$university)) {
  uni_tweets <- tweets %>%
    filter(university == uni, lang %in% langs)
  tokens_uni <- uni_tweets %>%
    corpus(text_field = "full_text_emojis") %>%
    tokens(
      remove_punct = TRUE, remove_symbols = TRUE, remove_numbers = TRUE,
      remove_url = TRUE, remove_separators = TRUE
    ) %>%
    tokens_tolower() %>%
    tokens_wordstem() %>%
    tokens_ngrams(n = 1) %>%
    tokens_select(
      pattern =
        c(
          stopwords("en"), stopwords("de"),
          stopwords("fr"), stopwords("it"), extended_stopwords
        ), selection = "remove"
    )
  uni_dfm <- dfm(tokens_uni)
  freqs_uni <- sort(colSums(uni_dfm), decreasing = TRUE)
  # print most common words: the emoji right are used often
  print(paste("Most common words for", uni, ":"))
  print(head(freqs_uni, 20))
  set.seed(123)
  wordcloud(
    words = names(freqs_uni),
    freq = freqs_uni,
    max.words = 200,
    random.order = FALSE,
    colors = brewer.pal(8, "Dark2")
  )
  # Analyze Top Emojis by University
  emoji_count_per_university <- uni_tweets %>%
    top_n_emojis(full_text)
  print(paste("Top emojis for", uni, ":"))
  print(emoji_count_per_university)

  print(emoji_count_per_university %>%
    mutate(emoji_name = reorder(emoji_name, n)) %>%
    ggplot(aes(n, emoji_name)) +
    geom_col() +
    labs(x = "Count", y = NULL, title = "Top 20 Emojis Used"))
}
## [1] "Most common words for FHNW :"
##       @fhnwbusi            fhnw        @hsafhnw            mehr       hochschul 
##             377             320             257             233             214 
##    @fhnwtechnik           @fhnw            neue          campus            heut 
##             205             203             160             149             149 
## @fhnwpsychologi      studierend           neuen         projekt         schweiz 
##             138             114             108             101             101 
##           knnen           basel      entwickelt  brugg-windisch            prof 
##              97              94              85              83              82

## [1] "Top emojis for FHNW :"
## # A tibble: 20 × 4
##    emoji_name                    unicode emoji_category        n
##    <chr>                         <chr>   <chr>             <int>
##  1 backhand_index_pointing_right 👉      People & Body        57
##  2 yellow_heart                  💛      Smileys & Emotion    34
##  3 black_heart                   🖤      Smileys & Emotion    32
##  4 woman                         👩      People & Body        28
##  5 clap                          👏      People & Body        17
##  6 flag_Switzerland              🇨🇭      Flags                17
##  7 man                           👨      People & Body        17
##  8 microscope                    🔬      Objects              15
##  9 computer                      💻      Objects              14
## 10 robot                         🤖      Smileys & Emotion    14
## 11 graduation_cap                🎓      Objects              13
## 12 school                        🏫      Travel & Places      13
## 13 face_with_medical_mask        😷      Smileys & Emotion    12
## 14 raised_hands                  🙌      People & Body        12
## 15 female_sign                   ♀️       Symbols              10
## 16 star_struck                   🤩      Smileys & Emotion    10
## 17 trophy                        🏆      Activities           10
## 18 party_popper                  🎉      Activities            9
## 19 woman_scientist               👩‍🔬      People & Body         9
## 20 sun_with_face                 🌞      Travel & Places       8

## [1] "Most common words for FH_Graubuenden :"
##            chur        #htwchur           statt             htw          findet 
##             365             355             305             290             274 
##            mehr     blogbeitrag           #fhgr     #infoanlass       graubnden 
##             230             203             182             177             176 
##      infoanlass           #chur           neuen            heut        #studium 
##             157             128             128             127             123 
##           manag @suedostschweiz            neue              fh       tourismus 
##             110             108             103              98              96

## [1] "Top emojis for FH_Graubuenden :"
## # A tibble: 20 × 4
##    emoji_name                      unicode emoji_category        n
##    <chr>                           <chr>   <chr>             <int>
##  1 party_popper                    🎉      Activities           19
##  2 grinning_face_with_big_eyes     😃      Smileys & Emotion    15
##  3 blush                           😊      Smileys & Emotion     9
##  4 smiling_face_with_sunglasses    😎      Smileys & Emotion     8
##  5 bulb                            💡      Objects               7
##  6 flexed_biceps                   💪      People & Body         7
##  7 +1                              👍      People & Body         6
##  8 camera_flash                    📸      Objects               6
##  9 four_leaf_clover                🍀      Animals & Nature      6
## 10 grinning_face_with_smiling_eyes 😄      Smileys & Emotion     6
## 11 heart_eyes                      😍      Smileys & Emotion     6
## 12 hugs                            🤗      Smileys & Emotion     6
## 13 grinning                        😀      Smileys & Emotion     5
## 14 computer                        💻      Objects               4
## 15 female_sign                     ♀️       Symbols               4
## 16 graduation_cap                  🎓      Objects               4
## 17 robot                           🤖      Smileys & Emotion     4
## 18 backhand_index_pointing_down    👇      People & Body         3
## 19 lady_beetle                     🐞      Animals & Nature      3
## 20 ocean                           🌊      Travel & Places       3

## [1] "Most common words for ZHAW :"
##             zhaw            @zhaw @engineeringzhaw            #zhaw 
##              278              254              244              184 
##               cc             neue             dank  @iam_winterthur 
##              183              151              146              145 
##             heut             mehr            knnen            zeigt 
##              141              138              137              133 
##       winterthur        schweizer              via        @sml_zhaw 
##              126              125              124              120 
##      #zhawimpact          schweiz            studi             gibt 
##              120              115              115              105

## [1] "Top emojis for ZHAW :"
## # A tibble: 20 × 4
##    emoji_name                    unicode emoji_category        n
##    <chr>                         <chr>   <chr>             <int>
##  1 backhand_index_pointing_right 👉      People & Body        21
##  2 high_voltage                  ⚡      Travel & Places      11
##  3 wink                          😉      Smileys & Emotion     9
##  4 clap                          👏      People & Body         5
##  5 flag_Switzerland              🇨🇭      Flags                 5
##  6 rocket                        🚀      Travel & Places       5
##  7 +1                            👍      People & Body         4
##  8 arrow_right                   ➡️       Symbols               4
##  9 bug                           🐛      Animals & Nature      3
## 10 computer                      💻      Objects               3
## 11 flexed_biceps                 💪      People & Body         3
## 12 man                           👨      People & Body         3
## 13 bangbang                      ‼️       Symbols               2
## 14 camera_flash                  📸      Objects               2
## 15 dark_skin_tone                🏿      Component             2
## 16 exclamation                   ❗      Symbols               2
## 17 female_sign                   ♀️       Symbols               2
## 18 four_leaf_clover              🍀      Animals & Nature      2
## 19 green_salad                   🥗      Food & Drink          2
## 20 grinning                      😀      Smileys & Emotion     2

## [1] "Most common words for bfh :"
##             bfh            bern          berner            mehr       @bfh_hesb 
##             606             280             231             213             207 
##            neue           thema   fachhochschul        @hkb_bfh           knnen 
##             205             199             166             117             109 
##         projekt #knoten_maschen            biel            heut           innen 
##             109             107             102             102             102 
##        anmelden       schweizer           neuen         schweiz           statt 
##             101             100              96              92              89

## [1] "Top emojis for bfh :"
## # A tibble: 20 × 4
##    emoji_name                    unicode emoji_category        n
##    <chr>                         <chr>   <chr>             <int>
##  1 backhand_index_pointing_right 👉      People & Body        49
##  2 battery                       🔋      Objects              16
##  3 backhand_index_pointing_down  👇      People & Body        12
##  4 woman                         👩      People & Body        12
##  5 palm_tree                     🌴      Animals & Nature     11
##  6 bulb                          💡      Objects              10
##  7 computer                      💻      Objects              10
##  8 evergreen_tree                🌲      Animals & Nature     10
##  9 graduation_cap                🎓      Objects              10
## 10 party_popper                  🎉      Activities           10
## 11 robot                         🤖      Smileys & Emotion    10
## 12 rocket                        🚀      Travel & Places      10
## 13 clap                          👏      People & Body         9
## 14 coconut                       🥥      Food & Drink          9
## 15 date                          📅      Objects               9
## 16 deciduous_tree                🌳      Animals & Nature      9
## 17 flag_Switzerland              🇨🇭      Flags                 9
## 18 automobile                    🚗      Travel & Places       8
## 19 clinking_glasses              🥂      Food & Drink          8
## 20 seedling                      🌱      Animals & Nature      8

## [1] "Most common words for hes_so :"
##      hes-so       right       arrow      projet         dan     tudiant 
##         529         445         331         249         248         192 
##        haut     @hes_so        cole @hessovalai        book      master 
##         177         170         149         133         124         123 
##     #hes_so        open    recherch       suiss        plus     magnifi 
##         123         123         116         109         103          97 
##       glass        tilt 
##          97          97

## [1] "Top emojis for hes_so :"
## # A tibble: 20 × 4
##    emoji_name                    unicode emoji_category      n
##    <chr>                         <chr>   <chr>           <int>
##  1 arrow_right                   ➡️       Symbols           320
##  2 arrow_heading_down            ⤵️       Symbols           246
##  3 book                          📖      Objects           115
##  4 mag_right                     🔎      Objects            97
##  5 mega                          📣      Objects            53
##  6 clapper                       🎬      Objects            38
##  7 NEW_button                    🆕      Symbols            35
##  8 computer                      💻      Objects            35
##  9 microscope                    🔬      Objects            32
## 10 bulb                          💡      Objects            29
## 11 police_car_light              🚨      Travel & Places    28
## 12 backhand_index_pointing_right 👉      People & Body      27
## 13 graduation_cap                🎓      Objects            23
## 14 studio_microphone             🎙️       Objects            23
## 15 clap                          👏      People & Body      21
## 16 date                          📅      Objects            17
## 17 medal_sports                  🏅      Activities         15
## 18 memo                          📝      Objects            15
## 19 woman                         👩      People & Body      15
## 20 flag_Switzerland              🇨🇭      Flags              14

## [1] "Most common words for hslu :"
##           @hslu          luzern            mehr       hochschul          depart 
##             363             332             258             198             175 
## #hsluinformatik            heut            neue       schweizer           zeigt 
##             172             171             165             146             133 
##          design           knnen           studi         schweiz            gibt 
##             132             127             125             118             114 
##            jahr              ab           neuen         projekt          arbeit 
##             106             105             105             103              99

## [1] "Top emojis for hslu :"
## # A tibble: 20 × 4
##    emoji_name                   unicode emoji_category        n
##    <chr>                        <chr>   <chr>             <int>
##  1 sparkles                     ✨      Activities           29
##  2 flag_Switzerland             🇨🇭      Flags                20
##  3 party_popper                 🎉      Activities           12
##  4 rocket                       🚀      Travel & Places      12
##  5 partying_face                🥳      Smileys & Emotion    11
##  6 bottle_with_popping_cork     🍾      Food & Drink          9
##  7 Christmas_tree               🎄      Activities            7
##  8 clap                         👏      People & Body         7
##  9 star                         ⭐      Travel & Places       7
## 10 glowing_star                 🌟      Travel & Places       6
## 11 +1                           👍      People & Body         5
## 12 bulb                         💡      Objects               5
## 13 clinking_glasses             🥂      Food & Drink          5
## 14 smiling_face_with_sunglasses 😎      Smileys & Emotion     5
## 15 camera_flash                 📸      Objects               4
## 16 four_leaf_clover             🍀      Animals & Nature      4
## 17 musical_notes                🎶      Objects               4
## 18 person_running               🏃      People & Body         4
## 19 raised_hands                 🙌      People & Body         4
## 20 robot                        🤖      Smileys & Emotion     4

## [1] "Most common words for ost_fh :"
## #ostschweizerfachhochschul                    @ost_fh 
##                         72                         63 
##                        ost                   @ozg_ost 
##                         55                         28 
##                       mehr                       neue 
##                         26                         22 
##                  st.gallen                 rapperswil 
##                         17                         17 
##                      neuen                 ostschweiz 
##                         17                         15 
##                #informatik                    podcast 
##                         15                         15 
##                     detail                       gibt 
##                         15                         14 
##                       #ost                   menschen 
##                         12                         12 
##                      thema                     campus 
##                         12                         12 
##                 @eastdigit                   #podcast 
##                         12                         12

## [1] "Top emojis for ost_fh :"
## # A tibble: 20 × 4
##    emoji_name                    unicode emoji_category        n
##    <chr>                         <chr>   <chr>             <int>
##  1 graduation_cap                🎓      Objects               3
##  2 man                           👨      People & Body         2
##  3 man_student                   👨‍🎓      People & Body         2
##  4 rocket                        🚀      Travel & Places       2
##  5 snowflake                     ❄️       Travel & Places       2
##  6 backhand_index_pointing_right 👉      People & Body         1
##  7 brain                         🧠      People & Body         1
##  8 chocolate_bar                 🍫      Food & Drink          1
##  9 clapper                       🎬      Objects               1
## 10 eyes                          👀      People & Body         1
## 11 fire                          🔥      Travel & Places       1
## 12 flexed_biceps                 💪      People & Body         1
## 13 grinning                      😀      Smileys & Emotion     1
## 14 heart_eyes_cat                😻      Smileys & Emotion     1
## 15 high_voltage                  ⚡      Travel & Places       1
## 16 mantelpiece_clock             🕰️       Travel & Places       1
## 17 sleeping                      😴      Smileys & Emotion     1
## 18 slightly_smiling_face         🙂      Smileys & Emotion     1
## 19 sun                           ☀️       Travel & Places       1
## 20 woman                         👩      People & Body         1

## [1] "Most common words for supsi_ch :"
##        supsi     #supsiev    #supsinew         info    formazion     studenti 
##          224          175          167          148          132          132 
##     progetto    @supsi_ch   iscrizioni       master        nuovo     bachelor 
##          126          123          117          117          116          113 
##        right        innov dipartimento           pi         oggi @usi_univers 
##          113          108          103          103          102           98 
## informazioni        manag 
##           97           94

## [1] "Top emojis for supsi_ch :"
## # A tibble: 20 × 4
##    emoji_name                    unicode emoji_category        n
##    <chr>                         <chr>   <chr>             <int>
##  1 arrow_right                   ➡️       Symbols              84
##  2 backhand_index_pointing_right 👉      People & Body        24
##  3 arrow_forward                 ▶️       Symbols              18
##  4 graduation_cap                🎓      Objects              17
##  5 bulb                          💡      Objects              10
##  6 flag_Switzerland              🇨🇭      Flags                 9
##  7 rocket                        🚀      Travel & Places       9
##  8 party_popper                  🎉      Activities            8
##  9 clap                          👏      People & Body         7
## 10 exclamation                   ❗      Symbols               5
## 11 SOON_arrow                    🔜      Symbols               4
## 12 grinning_face_with_big_eyes   😃      Smileys & Emotion     4
## 13 Christmas_tree                🎄      Activities            3
## 14 camera_flash                  📸      Objects               3
## 15 computer                      💻      Objects               3
## 16 movie_camera                  🎥      Objects               3
## 17 pushpin                       📌      Objects               3
## 18 rainbow                       🌈      Travel & Places       3
## 19 studio_microphone             🎙️       Objects               3
## 20 woman                         👩      People & Body         3

# Generate general tokens for bigram and trigram analysis
tokens <- tweets %>%
  corpus(text_field = "full_text_emojis") %>%
  tokens(
    remove_punct = TRUE, remove_symbols = TRUE, remove_numbers = TRUE,
    remove_url = TRUE, remove_separators = TRUE
  ) %>%
  tokens_tolower() %>%
  tokens_wordstem() %>%
  tokens_select(
    pattern =
      c(
        stopwords("en"), stopwords("de"),
        stopwords("fr"), stopwords("it"), extended_stopwords
      ), selection = "remove"
  )
# Bigram Wordcloud
bi_gram_tokens <- tokens_ngrams(tokens, n = 2)
dfm_bi_gram <- dfm(bi_gram_tokens)
freqs_bi_gram <- sort(colSums(dfm_bi_gram), decreasing = TRUE)
head(freqs_bi_gram, 20)
##          right_arrow             htw_chur          index_point 
##                  421                  259                  207 
##       backhand_index     hochschul_luzern          point_right 
##                  206                  185                  183 
## berner_fachhochschul        sozial_arbeit              prof_dr 
##                  157                  154                  142 
##            haut_cole      herzlich_gratul            open_book 
##                  141                  139                  117 
##        magnifi_glass           glass_tilt           tilt_right 
##                   97                   97                   97 
##         fh_graubnden  neusten_blogbeitrag   book_#revuehmisphr 
##                   91                   87                   85 
##         social_media         advanc_studi 
##                   84                   83
# Create the bigram word cloud
set.seed(123)
wordcloud(
  words = names(freqs_bi_gram),
  freq = freqs_bi_gram,
  max.words = 200,
  random.order = FALSE,
  colors = brewer.pal(8, "Dark2")
)

# Trigram Wordcloud
tri_gram_tokens <- tokens_ngrams(tokens, n = 3)
dfm_tri_gram <- dfm(tri_gram_tokens)
reqs_tri_gram <- sort(colSums(dfm_tri_gram), decreasing = TRUE)
head(reqs_tri_gram, 20)
##         backhand_index_point            index_point_right 
##                          206                          183 
##           magnifi_glass_tilt             glass_tilt_right 
##                           97                           97 
##      open_book_#revuehmisphr   hochschul_gestaltung_kunst 
##                           85                           62 
## dipartimento_tecnologi_innov          master_advanc_studi 
##                           40                           38 
##         depart_sozial_arbeit       #infoanlass_mrz_findet 
##                           36                           33 
##              polic_car_light         univers_appli_scienc 
##                           32                           31 
##         busi_administr_statt     findet_#zrich_infoanlass 
##                           30                           30 
##               tag_offenen_tr        hochschul_life_scienc 
##                           29                           29 
##        gestaltung_kunst_fhnw           mas_busi_administr 
##                           29                           28 
##       mehr_neuen_blogbeitrag     mehr_neusten_blogbeitrag 
##                           28                           28
# Create the bigram word cloud
set.seed(123)
wordcloud(
  words = names(reqs_tri_gram),
  freq = reqs_tri_gram,
  max.words = 200,
  random.order = FALSE,
  colors = brewer.pal(8, "Dark2")
)

LDA Topic Modeling

The Latent Dirichlet Allocation (LDA) model was applied to the entire dataset of tweets to identify common topics. Here, the model with 5 topics was selected, and the top terms for each topic were extracted.

General Topics Identified:

  • Topic 1: Includes terms like “student”, “project”, “innov”, “scienc”, indicating a focus on academic and research activities.
  • Topic 2: Contains words such as “thank”, “welcome”, “now”, “today”, highlighting interactions and expressions of gratitude or positive events.
  • Topic 3: Features terms like “student”, “research”, “switzerland”, “new”, focusing also on academic and research activities.
  • Topic 4: Involves terms like “digit”, “univers”, “day” this handlea also about academic and research activities but also about digitalization.
  • Topic 5: Includes words such as “new”, “student”, “present”, “conf”, emphasizing new developments, student activities, and events or conferences.

It seems that these topics are not that far apart, which could indicate that the tweets are quite similar in content and style. The LDA model still shows us that topics like academic and research activities, student activities, and new developments are relevant in the tweets.

# Source: Christoph Zangger -> löscht alle Reihen mit nur 0s
new_dfm <- dfm_subset(dfm_list$en, ntoken(dfm_list$en) > 0)
tweet_lda <- LDA(new_dfm, k = 5, control = list(seed = 123))
# Tidy the LDA results
topic_terms <- tidy(tweet_lda, matrix = "beta")
# Extract topics and top terms
topics <- as.data.frame(terms(tweet_lda, 50)) # First fifty words per topic

# Extract top terms per topic
top_terms <- topic_terms %>%
  group_by(topic) %>%
  top_n(8, beta) %>%
  ungroup() %>%
  arrange(topic, -beta)

print(as.data.frame(terms(tweet_lda, 20)))
##      Topic 1      Topic 2       Topic 3   Topic 4     Topic 5
## 1      @hslu       scienc         @hslu   univers         new
## 2      innov        @zhaw       student       day     student
## 3      studi        today         innov   student   @supsi_ch
## 4    student        thank           new     appli       swiss
## 5      start  switzerland      research     @zhaw       @zhaw
## 6      thank      project   switzerland      open     univers
## 7    project    @bfh_hesb            us     digit     present
## 8        now       welcom           now     great       manag
## 9        can          now         @fhnw     swiss      confer
## 10    scienc    @fhnwbusi           day    welcom switzerland
## 11     right #switzerland       project      team       studi
## 12     swiss       confer           der @bfh_hesb       great
## 13    #innov        learn         event     manag       @fhnw
## 14 @supsi_ch      congrat      challeng      call       today
## 15     video      student          join      next     project
## 16      educ         open         swiss    design        take
## 17   develop         join @academies_ch      read       happi
## 18      hand         last   #digitale21    scienc        join
## 19    intern     research         visit     thank     exchang
## 20 @fhnwbusi        @hslu       develop     learn     forward
# Visualize top terms per topic
top_terms %>%
  mutate(term = reorder_within(term, beta, topic)) %>%
  ggplot(aes(beta, term, fill = factor(topic))) +
  geom_col(show.legend = FALSE) +
  facet_wrap(~topic, scales = "free") +
  scale_y_reordered() +
  labs(
    x = "Beta (Term Importance within Topic)",
    y = "Terms",
    title = "Top Terms per Topic in Tweets (LDA)"
  ) +
  theme_minimal()

# Most different words among topics (using log ratios)
diff <- topic_terms %>%
  mutate(topic = paste0("topic", topic)) %>%
  spread(topic, beta) %>%
  filter(topic1 > .001 | topic2 > .001 | topic3 > .001) %>%
  mutate(
    logratio_t1t2 = log2(topic2 / topic1),
    logratio_t1t3 = log2(topic3 / topic1),
    logratio_t2t3 = log2(topic3 / topic2)
  )
diff
## # A tibble: 313 × 9
##    term       topic1  topic2  topic3  topic4  topic5 logratio_t1t2 logratio_t1t3
##    <chr>       <dbl>   <dbl>   <dbl>   <dbl>   <dbl>         <dbl>         <dbl>
##  1 @academi… 2.26e-3 6.89e-4 4.17e-3 8.28e-5 2.11e-3        -1.71         0.881 
##  2 @bfh_hesb 1.03e-3 5.87e-3 1.19e-3 4.82e-3 2.62e-3         2.52         0.218 
##  3 @empa_ch  1.07e-3 4.16e-6 1.58e-4 5.76e-5 4.35e-4        -8.01        -2.76  
##  4 @enginee… 6.40e-5 1.38e-4 1.73e-3 1.19e-4 7.10e-4         1.11         4.75  
##  5 @esnchur  5.17e-4 4.43e-4 1.03e-3 4.06e-4 1.90e-5        -0.222        0.992 
##  6 @fh_grau… 6.45e-4 1.45e-3 6.90e-4 9.07e-4 7.90e-4         1.17         0.0983
##  7 @fhnw     3.90e-4 3.08e-3 5.15e-3 2.83e-3 4.77e-3         2.98         3.72  
##  8 @fhnwbusi 3.75e-3 5.59e-3 1.95e-3 4.77e-4 1.00e-3         0.577       -0.945 
##  9 @greater… 5.47e-4 1.24e-3 1.09e-3 1.22e-3 7.35e-4         1.18         0.993 
## 10 @grstift… 2.04e-4 3.60e-3 1.59e-3 1.25e-3 6.04e-4         4.14         2.96  
## # ℹ 303 more rows
## # ℹ 1 more variable: logratio_t2t3 <dbl>
# LDA Topic Modeling for each university
universities <- unique(tweets$university)

for (uni in universities) {
  # Filter tweets for the current university
  uni_tweets <- tweets %>% filter(university == uni & lang %in% langs)

  tokens_uni <- uni_tweets %>%
    corpus(text_field = "full_text_emojis") %>%
    tokens(
      remove_punct = TRUE, remove_symbols = TRUE, remove_numbers = TRUE,
      remove_url = TRUE, remove_separators = TRUE
    ) %>%
    tokens_tolower() %>%
    tokens_wordstem() %>%
    tokens_ngrams(n = 1) %>%
    tokens_select(
      pattern =
        c(
          stopwords("en"), stopwords("de"),
          stopwords("fr"), stopwords("it"), extended_stopwords
        ), selection = "remove"
    )
  uni_dfm <- dfm(tokens_uni)
  # Apply LDA
  uni_dfm <- dfm_subset(uni_dfm, ntoken(uni_dfm) > 0)
  tweet_lda <- LDA(uni_dfm, k = 5, control = list(seed = 123))
  # Tidy the LDA results
  tweet_lda_td <- tidy(tweet_lda)
  # Extract top terms per topic
  top_terms <- tweet_lda_td %>%
    group_by(topic) %>%
    top_n(8, beta) %>%
    ungroup() %>%
    arrange(topic, -beta)
  # Visualize top terms per topic
  p <- top_terms %>%
    mutate(term = reorder_within(term, beta, topic)) %>%
    ggplot(aes(beta, term, fill = factor(topic))) +
    geom_col(show.legend = FALSE) +
    facet_wrap(~topic, scales = "free") +
    scale_y_reordered() +
    labs(
      x = "Beta (Term Importance within Topic)",
      y = NULL,
      title = paste("Top Terms per Topic in Tweets from", uni, "(LDA)")
    )
  print(p)
  # Topic Model Summary: top 10 terms per topic
  cat("\nTopic Model Summary for", uni, ":\n")
  print(as.data.frame(terms(tweet_lda, 10)))
}

## 
## Topic Model Summary for FHNW :
##         Topic 1         Topic 2      Topic 3         Topic 4      Topic 5
## 1      @hsafhnw            fhnw    @fhnwbusi       @fhnwbusi    @fhnwbusi
## 2          neue       hochschul     @hsafhnw            fhnw         mehr
## 3          heut           index        @fhnw          campus        @fhnw
## 4          fhnw        backhand    hochschul      entwickelt @fhnwtechnik
## 5  @fhnwtechnik @fhnwpsychologi @fhnwtechnik  brugg-windisch        neuen
## 6         olten           point         heut            mehr         neue
## 7          mehr         projekt         mehr @fhnwpsychologi           ab
## 8     hochschul            heut   studierend    @fhnwtechnik   studierend
## 9         swiss           right         fhnw           basel         geht
## 10        basel           neuen         gibt       schweizer       erklrt

## 
## Topic Model Summary for FH_Graubuenden :
##       Topic 1         Topic 2         Topic 3     Topic 4     Topic 5
## 1        chur    @htwchurtour            chur       statt        mehr
## 2         htw         student           studi      findet blogbeitrag
## 3       #chur        #htwchur             via #infoanlass       neuen
## 4    #htwchur @suedostschweiz             htw  infoanlass       #fhgr
## 5    #studium         tourism        #htwchur    #htwchur   graubnden
## 6       #fhgr @fh_graubuenden    #smartcultur        chur     neusten
## 7   graubnden      studierend            heut       manag          fh
## 8    #schweiz            dank @suedostschweiz         mas        prof
## 9  #graubnden   @infowisschur           media        busi          dr
## 10    studium        institut    @clickandtri         mrz        wurd

## 
## Topic Model Summary for ZHAW :
##             Topic 1          Topic 2          Topic 3         Topic 4
## 1              zhaw            knnen             zhaw           @zhaw
## 2  @engineeringzhaw @engineeringzhaw @engineeringzhaw              cc
## 3             @zhaw            #zhaw             neue     #zhawimpact
## 4        winterthur            zeigt             geht            zhaw
## 5             knnen            studi             mehr @iam_winterthur
## 6         @sml_zhaw             neue               cc            heut
## 7               via             dank            zeigt           #zhaw
## 8             zeigt        schweizer            #zhaw             via
## 9              heut             heut             gibt         zukunft
## 10               ab            neuen            immer           neuen
##            Topic 5
## 1            @zhaw
## 2            #zhaw
## 3  @iam_winterthur
## 4            studi
## 5             zhaw
## 6             dank
## 7             mehr
## 8        @sml_zhaw
## 9          #toniar
## 10        #tonitag

## 
## Topic Model Summary for bfh :
##          Topic 1     Topic 2    Topic 3       Topic 4         Topic 5
## 1            bfh        bern        bfh           bfh            neue
## 2  fachhochschul   @bfh_hesb  @bfh_hesb         thema           thema
## 3         berner      berner    schweiz          mehr #knoten_maschen
## 4           bern        mehr       mehr         knnen          berner
## 5           mehr         bfh      neuen      @hkb_bfh            bern
## 6       @hkb_bfh unternehmen       neue          bern       schweizer
## 7     entwickelt        neue      innen         statt           welch
## 8       anmelden     projekt       bern          biel            gibt
## 9         arbeit          ab gesundheit fachhochschul      @bfh_sosec
## 10        sozial       thema  schweizer      erfahren             day

## 
## Topic Model Summary for hes_so :
##          Topic 1 Topic 2  Topic 3       Topic 4         Topic 5
## 1         projet   right   hes-so         right          hes-so
## 2          arrow  projet  tudiant         arrow            haut
## 3            dan    cole      dan          open           right
## 4          right    tilt  @hes_so          book         tudiant
## 5        #hes_so     dan    arrow        master           arrow
## 6    @hessovalai   glass   projet #revuehmisphr          projet
## 7       dcouvrez   arrow recherch         suiss             dan
## 8        @hes_so   journ  nouveau      recherch     @hessovalai
## 9  #revuehmisphr  format    suiss          cole @radiotelesuiss
## 10      particip @hes_so     haut        scienc         #hes_so

## 
## Topic Model Summary for hslu :
##            Topic 1   Topic 2   Topic 3   Topic 4         Topic 5
## 1            @hslu    luzern     @hslu     zeigt #hsluinformatik
## 2           luzern     @hslu    luzern      heut            mehr
## 3             mehr    depart hochschul     digit       schweizer
## 4  #hsluinformatik hochschul     knnen      mehr           @hslu
## 5             neue      heut      mehr hochschul         projekt
## 6           depart      neue    campus     welch #hsluwirtschaft
## 7            thema      mehr      jahr   projekt         schweiz
## 8           design      geht   #hsludk      neue          luzern
## 9        schweizer     zeigt    findet   schweiz      studierend
## 10           statt     knnen     statt     studi           studi

## 
## Topic Model Summary for ost_fh :
##                       Topic 1                    Topic 2
## 1  #ostschweizerfachhochschul #ostschweizerfachhochschul
## 2                     @ost_fh                    @ost_fh
## 3                    bachelor                        ost
## 4                        neue                   @ozg_ost
## 5                  #diplomfei                   kontrast
## 6                     podcast               kulturzyklus
## 7                          ab            #ausdreiwirdein
## 8                   #bachelor                        fhs
## 9            online-infoabend                       #ost
## 10                      statt                       mehr
##                       Topic 3                    Topic 4
## 1                     @ost_fh                        ost
## 2                         ost                    @ost_fh
## 3  #ostschweizerfachhochschul                   @ozg_ost
## 4             rapperswil-jona #ostschweizerfachhochschul
## 5                      campus                 rapperswil
## 6                      detail                       neue
## 7                         neu                      neuen
## 8      #wirtschaftsinformatik                 ostschweiz
## 9                        gibt                  st.gallen
## 10                #informatik                       mehr
##                       Topic 5
## 1  #ostschweizerfachhochschul
## 2                    menschen
## 3                        mehr
## 4                     @ost_fh
## 5                         ost
## 6                     spricht
## 7                        prof
## 8                    institut
## 9                        team
## 10                   @ozg_ost

## 
## Topic Model Summary for supsi_ch :
##         Topic 1      Topic 2      Topic 3   Topic 4  Topic 5
## 1      progetto    formazion   iscrizioni #supsinew    supsi
## 2      #supsiev     studenti       master     supsi     info
## 3  @usi_univers         oggi        supsi  #supsiev bachelor
## 4     #supsinew informazioni        innov  studenti    corsi
## 5     competenz        supsi        apert     right       pi
## 6         arrow       ottobr         info formazion #supsiev
## 7      svizzera    @supsi_ch     bachelor @supsi_ch    nuovo
## 8     @supsi_ch dipartimento       design      info progetto
## 9  informazioni     maggiori       advanc    master    studi
## 10        supsi   iscrizioni @usi_univers   ricerca   campus

Conclusion:

The LDA analysis reveals distinct topics across all tweets, emphasizing academic activities, events, and institutional developments. Overall the topics over all universities seem to be quite similar. Which indicates that the tweets are similar in content and style. However, the LDA model shows that topics like academic and research activities, student activities, and new developments are relevant in the tweets. Also because we stemmed the words could be not that understandable for the algorithm, this may have an effect on the choosen topics.

Style Analysis

The distribution of tweet lengths shows variation across universities. Most tweets are concise, aligning with Twitter’s character limit, but the exact length distribution differs among institutions. It is interesting to see that much tweets have around 150 words and that the tweets from the universities are not that long. It is a typical sign that the tweets are not that long and this is a common thing in social media.

tweets %>%
  mutate(tweet_length = nchar(full_text)) %>%
  ggplot(aes(x = tweet_length)) +
  geom_histogram() +
  labs(title = "Distribution of Tweet Lengths")

Sentiment Analysis

Sentiment analysis was conducted to evaluate the emotional tone of the tweets. The analysis used the Syuzhet method to calculate sentiment scores for each tweet. Syuzhet was chosen because of its ability to capture the emotions within text. Which makes it effective for analyzing narrative structures. Additionally, the NRC sentiment dictionary was utilized for its comprehensive coverage of emotions across multiple languages, including German, Italian, and French. See Documentation: https://cran.r-project.org/web/packages/syuzhet/vignettes/syuzhet-vignette.html#:~:text=Multilingual%20Sentiment%20Lexicons. Together, Syuzhet and NRC provide a powerful combination for performing detailed and multilingual sentiment analysis.

Overall Sentiment Trends: - The sentiment scores vary over time and by university, showing fluctuations in the emotional tone of the tweets. - Positive words commonly found in tweets include terms related to academic achievements, collaborations, and positive experiences. - Negative words often relate to challenges, competitions, and issues faced by the universities.

# Calculate Sentiment for Supported Languages Only
langs <- c("de", "fr", "it", "en")

tweets_filtered <- tweets %>%
  filter(lang %in% langs)

# Function to get sentiment based on language
get_multilang_sentiment <- function(text, lang) {
  if (lang == "de") {
    return(get_sentiment(text, method = "nrc", language = "german"))
  } else if (lang == "it") {
    return(get_sentiment(text, method = "nrc", language = "italian"))
  } else if (lang == "fr") {
    return(get_sentiment(text, method = "nrc", language = "french"))
  } else if (lang == "en") {
    return(get_sentiment(text, method = "syuzhet"))
  } else {
    return(NA) # Return NA for unsupported languages
  }
}
# Calculate Syuzhet Sentiment for each Tweet
tweets_filtered$sentiment <-
  mapply(
    get_multilang_sentiment,
    tweets_filtered$full_text, tweets_filtered$lang
  )

plot_data <- tweets_filtered %>%
  group_by(university, month) %>%
  summarize(mean_sentiment = mean(sentiment, na.rm = TRUE))

# Convert month to Date format
plot_data <- plot_data %>%
  mutate(month = as.Date(month, format = "%Y-%m-%d"))

# Plot Sentiment by all Universities
ggplot(plot_data, aes(
  x = month,
  y = mean_sentiment,
  color = university, group = university
)) +
  geom_line() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(
    title = "Mean Sentiment Over Time by University",
    y = "Mean Sentiment Score",
    x = "Month"
  ) +
  scale_x_date(date_breaks = "1 month", date_labels = "%Y-%m") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  theme_minimal()

for (uni in unique(tweets_filtered$university)) {
  most_used_lang <- tweets %>%
    filter(university == uni) %>%
    count(lang) %>%
    slice_max(n = 1, order_by = n) %>%
    pull(lang)

  uni_tweets <- tweets_filtered %>%
    filter(university == uni & lang == most_used_lang)

  plot_data <- uni_tweets %>%
    group_by(month) %>%
    summarize(mean_sentiment = mean(sentiment, na.rm = TRUE))

  # Plot Syuzhet Sentiment Over Time (Per University)
  print(ggplot(plot_data, aes(x = month, y = mean_sentiment, group = 1)) +
    geom_line() +
    geom_smooth(method = "lm", se = FALSE, color = "red") +
    labs(
      title = paste0("Mean Syuzhet Sentiment Over Time by - ", uni),
      y = "Mean Sentiment Score",
      x = "Month"
    ))

  # Tokenize and Preprocess Words
  uni_words <- uni_tweets %>%
    unnest_tokens(word, full_text_emojis) %>%
    filter(nchar(word) > 3) %>%
    filter(!str_detect(word, "\\d")) %>%
    filter(!str_detect(word, "https?://\\S+|www\\.\\S+|t\\.co|http|https"))

  # Remove stopwords after counting word frequency
  word_counts <- uni_words %>%
    count(word, sort = TRUE) %>%
    anti_join(get_stopwords(language = most_used_lang), by = "word")

  sentiment_words <- word_counts %>%
    mutate(sentiment = get_multilang_sentiment(word, most_used_lang))

  # Separate Positive and Negative Words
  positive_words <- sentiment_words %>%
    filter(sentiment >= 0) %>%
    arrange(desc(n)) %>%
    rename(freq = n)

  negative_words <- sentiment_words %>%
    filter(sentiment < 0) %>%
    arrange(desc(n)) %>%
    rename(freq = n)

  # Create and Display Word Clouds
  print(paste0("Positive words for: ", uni, " in ", most_used_lang))
  print(head(positive_words, 20))
  wordcloud(
    words = positive_words$word,
    freq = positive_words$freq,
    max.words = 200,
    random.order = FALSE,
    colors = brewer.pal(8, "Dark2")
  )

  print(paste0("Negative words for: ", uni, " in ", most_used_lang))
  print(head(negative_words, 20))
  wordcloud(
    words = negative_words$word,
    freq = negative_words$freq,
    max.words = 200,
    random.order = FALSE,
    colors = brewer.pal(8, "Dark2")
  )
}

## [1] "Positive words for: FHNW in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word             freq sentiment
##    <chr>      <chr>           <int>     <dbl>
##  1 FHNW       fhnw             1598         0
##  2 FHNW       fhnwbusiness      342         0
##  3 FHNW       hsafhnw           253         0
##  4 FHNW       mehr              232         0
##  5 FHNW       hochschule        219         0
##  6 FHNW       fhnwtechnik       204         0
##  7 FHNW       campus            170         0
##  8 FHNW       heute             147         0
##  9 FHNW       fhnwpsychologie   138         0
## 10 FHNW       basel             134         0
## 11 FHNW       neue              134         0
## 12 FHNW       studierende       123         0
## 13 FHNW       brugg             111         0
## 14 FHNW       neuen             108         0
## 15 FHNW       schweiz           107         0
## 16 FHNW       windisch          101         0
## 17 FHNW       olten              98         0
## 18 FHNW       knnen              97         0
## 19 FHNW       projekt            86         0
## 20 FHNW       prof               83         0

## [1] "Negative words for: FHNW in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word             freq sentiment
##    <chr>      <chr>           <int>     <dbl>
##  1 FHNW       wettbewerb         13        -1
##  2 FHNW       problem            12        -1
##  3 FHNW       trotz              12        -2
##  4 FHNW       bakterien          10        -1
##  5 FHNW       herausforderung    10        -1
##  6 FHNW       junge              10        -1
##  7 FHNW       krise              10        -1
##  8 FHNW       sucht              10        -1
##  9 FHNW       spiel               8        -1
## 10 FHNW       spielen             6        -1
## 11 FHNW       kaum                5        -1
## 12 FHNW       lust                5        -1
## 13 FHNW       nachfrage           5        -1
## 14 FHNW       reihe               5        -1
## 15 FHNW       schwer              5        -3
## 16 FHNW       tragen              5        -1
## 17 FHNW       warten              5        -1
## 18 FHNW       angriff             4        -2
## 19 FHNW       bisher              4        -1
## 20 FHNW       eingeweiht          4        -1

## [1] "Positive words for: FH_Graubuenden in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university     word            freq sentiment
##    <chr>          <chr>          <int>     <dbl>
##  1 FH_Graubuenden htw_chur         553         0
##  2 FH_Graubuenden chur             469         0
##  3 FH_Graubuenden htwchur          345         0
##  4 FH_Graubuenden infoanlass       327         0
##  5 FH_Graubuenden statt            299         0
##  6 FH_Graubuenden findet           271         0
##  7 FH_Graubuenden graubnden        258         0
##  8 FH_Graubuenden studium          255         0
##  9 FH_Graubuenden mehr             230         0
## 10 FH_Graubuenden blogbeitrag      213         0
## 11 FH_Graubuenden tourismus        177         0
## 12 FH_Graubuenden fhgr             172         0
## 13 FH_Graubuenden bachelor         162         0
## 14 FH_Graubuenden management       136         0
## 15 FH_Graubuenden neuen            128         0
## 16 FH_Graubuenden heute            125         0
## 17 FH_Graubuenden schweiz          123         0
## 18 FH_Graubuenden multimedia       107         0
## 19 FH_Graubuenden photonics        104         0
## 20 FH_Graubuenden suedostschweiz   104         0

## [1] "Negative words for: FH_Graubuenden in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university     word             freq sentiment
##    <chr>          <chr>           <int>     <dbl>
##  1 FH_Graubuenden sucht              15        -1
##  2 FH_Graubuenden wettbewerb         13        -1
##  3 FH_Graubuenden korruption         12        -1
##  4 FH_Graubuenden junge              10        -1
##  5 FH_Graubuenden hoch                8        -1
##  6 FH_Graubuenden herausforderung     7        -1
##  7 FH_Graubuenden rutsch              7        -1
##  8 FH_Graubuenden kick                6        -1
##  9 FH_Graubuenden vergessen           6        -3
## 10 FH_Graubuenden kochen              5        -1
## 11 FH_Graubuenden problem             5        -1
## 12 FH_Graubuenden lust                4        -1
## 13 FH_Graubuenden spielen             4        -1
## 14 FH_Graubuenden fall                3        -1
## 15 FH_Graubuenden falls               3        -1
## 16 FH_Graubuenden gefahr              3        -2
## 17 FH_Graubuenden kampf               3        -5
## 18 FH_Graubuenden krebs               3        -2
## 19 FH_Graubuenden krise               3        -1
## 20 FH_Graubuenden nachfrage           3        -1

## [1] "Positive words for: ZHAW in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word             freq sentiment
##    <chr>      <chr>           <int>     <dbl>
##  1 ZHAW       zhaw              901         0
##  2 ZHAW       engineeringzhaw   236         0
##  3 ZHAW       iam_winterthur    141         0
##  4 ZHAW       studie            139         1
##  5 ZHAW       mehr              138         0
##  6 ZHAW       schweiz           138         0
##  7 ZHAW       knnen             137         0
##  8 ZHAW       heute             134         0
##  9 ZHAW       zeigt             131         0
## 10 ZHAW       schweizer         126         0
## 11 ZHAW       unsere            126         0
## 12 ZHAW       sml_zhaw          123         0
## 13 ZHAW       zhawimpact        120         0
## 14 ZHAW       neue              117         0
## 15 ZHAW       winterthur        117         0
## 16 ZHAW       danke             102         0
## 17 ZHAW       studierende        84         0
## 18 ZHAW       gibt               83         0
## 19 ZHAW       neuen              82         0
## 20 ZHAW       dass               81         0

## [1] "Negative words for: ZHAW in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word             freq sentiment
##    <chr>      <chr>           <int>     <dbl>
##  1 ZHAW       trotz              19        -2
##  2 ZHAW       fall               15        -1
##  3 ZHAW       junge              14        -1
##  4 ZHAW       sucht              11        -1
##  5 ZHAW       falls              10        -1
##  6 ZHAW       krise               9        -1
##  7 ZHAW       problem             9        -1
##  8 ZHAW       spielen             8        -1
##  9 ZHAW       behinderung         7        -5
## 10 ZHAW       herausforderung     7        -1
## 11 ZHAW       bisher              6        -1
## 12 ZHAW       bund                6        -1
## 13 ZHAW       druck               6        -1
## 14 ZHAW       fehlt               5        -2
## 15 ZHAW       hoch                5        -1
## 16 ZHAW       laut                5        -1
## 17 ZHAW       rauch               5        -1
## 18 ZHAW       schwer              5        -3
## 19 ZHAW       stress              5        -1
## 20 ZHAW       tragen              5        -1

## [1] "Positive words for: bfh in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word            freq sentiment
##    <chr>      <chr>          <int>     <dbl>
##  1 bfh        bern             290         0
##  2 bfh        berner           226         0
##  3 bfh        mehr             213         0
##  4 bfh        thema            196         0
##  5 bfh        neue             184         0
##  6 bfh        bfh_hesb         169         0
##  7 bfh        fachhochschule   164         0
##  8 bfh        biel             112         0
##  9 bfh        hkb_bfh          112         0
## 10 bfh        knnen            109         0
## 11 bfh        holz             106         0
## 12 bfh        knoten_maschen   106         0
## 13 bfh        hafl             104         0
## 14 bfh        heute            101         0
## 15 bfh        anmelden         100         0
## 16 bfh        innen            100         0
## 17 bfh        schweiz          100         0
## 18 bfh        schweizer        100         0
## 19 bfh        neuen             96         0
## 20 bfh        online            93         0

## [1] "Negative words for: bfh in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word             freq sentiment
##    <chr>      <chr>           <int>     <dbl>
##  1 bfh        krise              22        -1
##  2 bfh        wettbewerb         19        -1
##  3 bfh        armut              15        -1
##  4 bfh        junge              11        -1
##  5 bfh        herausforderung     9        -1
##  6 bfh        batterie            8        -1
##  7 bfh        boden               8        -2
##  8 bfh        schwer              8        -3
##  9 bfh        stress              8        -1
## 10 bfh        sucht               8        -1
## 11 bfh        bisher              7        -1
## 12 bfh        kosten              7        -1
## 13 bfh        liegen              7        -1
## 14 bfh        prozess             7        -1
## 15 bfh        sterben             7        -1
## 16 bfh        trotz               7        -2
## 17 bfh        fehlt               6        -2
## 18 bfh        hoch                6        -1
## 19 bfh        kaum                6        -1
## 20 bfh        spielen             6        -1

## [1] "Positive words for: hes_so in fr"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word            freq sentiment
##    <chr>      <chr>          <int>     <dbl>
##  1 hes_so     right            433         0
##  2 hes_so     arrow            324         0
##  3 hes_so     hes_so           267         0
##  4 hes_so     recherche        184         0
##  5 hes_so     projet           148         0
##  6 hes_so     master           144         0
##  7 hes_so     hessovalais      133         0
##  8 hes_so     suisse           126         0
##  9 hes_so     open             117         0
## 10 hes_so     book             116         0
## 11 hes_so     haute            108         0
## 12 hes_so     projets          104         0
## 13 hes_so     sant             104         0
## 14 hes_so     tudiantes        104         0
## 15 hes_so     plus             103         0
## 16 hes_so     revuehmisphres    97         0
## 17 hes_so     glass             96         0
## 18 hes_so     magnifying        96         0
## 19 hes_so     tilted            96         0
## 20 hes_so     genve             86         0

## [1] "Negative words for: hes_so in fr"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word         freq sentiment
##    <chr>      <chr>       <int>     <dbl>
##  1 hes_so     programme      58        -1
##  2 hes_so     lance          47        -1
##  3 hes_so     entre          46        -1
##  4 hes_so     appel          45        -1
##  5 hes_so     salon          41        -1
##  6 hes_so     contre         19        -1
##  7 hes_so     crise          19        -1
##  8 hes_so     tous           19        -1
##  9 hes_so     livre          13        -1
## 10 hes_so     rencontre      12        -1
## 11 hes_so     vice           11        -1
## 12 hes_so     demande         9        -1
## 13 hes_so     vide            8        -1
## 14 hes_so     destin          7        -1
## 15 hes_so     plein           7        -1
## 16 hes_so     sujet           7        -1
## 17 hes_so     disposition     6        -1
## 18 hes_so     douleur         6        -3
## 19 hes_so     faon            6        -1
## 20 hes_so     campagne        5        -1

## [1] "Positive words for: hslu in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word            freq sentiment
##    <chr>      <chr>          <int>     <dbl>
##  1 hslu       hslu            1645         0
##  2 hslu       luzern           288         0
##  3 hslu       mehr             257         0
##  4 hslu       unsere           201         0
##  5 hslu       hochschule       197         0
##  6 hslu       informatik       183         0
##  7 hslu       hsluinformatik   173         0
##  8 hslu       heute            171         0
##  9 hslu       schweizer        145         0
## 10 hslu       studie           141         1
## 11 hslu       design           137         0
## 12 hslu       neue             136         0
## 13 hslu       zeigt            132         0
## 14 hslu       knnen            124         0
## 15 hslu       arbeit           121         2
## 16 hslu       schweiz          121         0
## 17 hslu       bachelor         119         0
## 18 hslu       departement      116         0
## 19 hslu       studierende      113         0
## 20 hslu       kunst            108         2

## [1] "Negative words for: hslu in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word             freq sentiment
##    <chr>      <chr>           <int>     <dbl>
##  1 hslu       junge              19        -1
##  2 hslu       quiz               19        -1
##  3 hslu       trotz              17        -2
##  4 hslu       wettbewerb         17        -1
##  5 hslu       krise              16        -1
##  6 hslu       spiel              13        -1
##  7 hslu       spielen            12        -1
##  8 hslu       kaum               11        -1
##  9 hslu       tragen              9        -1
## 10 hslu       gewalt              8        -1
## 11 hslu       problem             8        -1
## 12 hslu       sucht               8        -1
## 13 hslu       herausforderung     7        -1
## 14 hslu       hype                7        -1
## 15 hslu       kraft               7        -1
## 16 hslu       bisher              6        -1
## 17 hslu       bund                6        -1
## 18 hslu       druck               6        -1
## 19 hslu       hoch                6        -1
## 20 hslu       kosten              6        -1

## [1] "Positive words for: ost_fh in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word                        freq sentiment
##    <chr>      <chr>                      <int>     <dbl>
##  1 ost_fh     ostschweizerfachhochschule    72         0
##  2 ost_fh     ost_fh                        57         0
##  3 ost_fh     podcast                       32         0
##  4 ost_fh     ozg_ost                       28         0
##  5 ost_fh     mehr                          26         0
##  6 ost_fh     kulturzyklus                  24         0
##  7 ost_fh     bachelor                      22         0
##  8 ost_fh     rapperswil                    22         0
##  9 ost_fh     informatik                    21         0
## 10 ost_fh     online                        21         0
## 11 ost_fh     neue                          18         0
## 12 ost_fh     neuen                         17         0
## 13 ost_fh     st.gallen                     17         0
## 14 ost_fh     ostschweizer                  16         0
## 15 ost_fh     details                       15         0
## 16 ost_fh     kontrast                      15         0
## 17 ost_fh     fachhochschule                14         0
## 18 ost_fh     campus                        13         0
## 19 ost_fh     unsere                        13         0
## 20 ost_fh     eastdigital                   12         0

## [1] "Negative words for: ost_fh in de"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word         freq sentiment
##    <chr>      <chr>       <int>     <dbl>
##  1 ost_fh     reihe           4        -1
##  2 ost_fh     abfall          2        -2
##  3 ost_fh     behinderung     2        -5
##  4 ost_fh     fall            2        -1
##  5 ost_fh     junge           2        -1
##  6 ost_fh     knapp           2        -1
##  7 ost_fh     schlagen        2        -2
##  8 ost_fh     trotz           2        -2
##  9 ost_fh     aufsehen        1        -1
## 10 ost_fh     ausrichtung     1        -1
## 11 ost_fh     blind           1        -2
## 12 ost_fh     blindheit       1        -1
## 13 ost_fh     dringend        1        -1
## 14 ost_fh     entscheiden     1        -1
## 15 ost_fh     epilepsie       1        -1
## 16 ost_fh     fesseln         1        -1
## 17 ost_fh     gegenstand      1        -1
## 18 ost_fh     gegner          1        -2
## 19 ost_fh     gewalt          1        -1
## 20 ost_fh     hype            1        -1

## [1] "Positive words for: supsi_ch in it"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word          freq sentiment
##    <chr>      <chr>        <int>     <dbl>
##  1 supsi_ch   supsi          959         0
##  2 supsi_ch   supsinews      152         0
##  3 supsi_ch   bachelor       147         0
##  4 supsi_ch   formazione     146         0
##  5 supsi_ch   info           138         0
##  6 supsi_ch   studenti       136         0
##  7 supsi_ch   supsievent     136         0
##  8 supsi_ch   master         129         0
##  9 supsi_ch   lugano         125         0
## 10 supsi_ch   progetto       125         0
## 11 supsi_ch   nuovo          115         0
## 12 supsi_ch   ticino         115         0
## 13 supsi_ch   iscrizioni     114         0
## 14 supsi_ch   dipartimento   102         0
## 15 supsi_ch   oggi           102         0
## 16 supsi_ch   ricerca        100         1
## 17 supsi_ch   informazioni    97         1
## 18 supsi_ch   right           95         0
## 19 supsi_ch   scopri          90         0
## 20 supsi_ch   svizzera        90         0

## [1] "Negative words for: supsi_ch in it"
## # A tibble: 20 × 4
## # Groups:   university [1]
##    university word         freq sentiment
##    <chr>      <chr>       <int>     <dbl>
##  1 supsi_ch   incontro       17        -1
##  2 supsi_ch   partire        13        -1
##  3 supsi_ch   partenza       12        -1
##  4 supsi_ch   emergenza      11        -1
##  5 supsi_ch   crisi          10        -1
##  6 supsi_ch   sfida          10        -2
##  7 supsi_ch   campagna        8        -1
##  8 supsi_ch   discutere       6        -1
##  9 supsi_ch   gruppo          6        -1
## 10 supsi_ch   rischio         6        -1
## 11 supsi_ch   disciplina      5        -1
## 12 supsi_ch   giovanni        5        -1
## 13 supsi_ch   intervento      5        -1
## 14 supsi_ch   conseguenze     4        -1
## 15 supsi_ch   disagio         4        -3
## 16 supsi_ch   fondo           4        -1
## 17 supsi_ch   gioco           4        -1
## 18 supsi_ch   parole          4        -1
## 19 supsi_ch   perdere         4        -2
## 20 supsi_ch   periodo         4        -1

Conclusion:

The analysis indicates that Swiss Universities of Applied Sciences exhibit diverse tweeting patterns in terms of content, style, and emotions. Tweets often focus on academic achievements, projects, and institutional news, with varying emotional tones across different universities.

Question 4: What specific advice can you give us as communication department of BFH based on your analysis? How can we integrate the analysis of tweets in our internal processes, can you think of any data products that would be of value for us?

# Language Analysis
tweets %>%
  filter(university == "bfh") %>%
  count(lang) %>%
  arrange(desc(n))
## # A tibble: 9 × 3
## # Groups:   university [1]
##   university lang      n
##   <chr>      <chr> <int>
## 1 bfh        de     3008
## 2 bfh        en      135
## 3 bfh        fr       35
## 4 bfh        qam       8
## 5 bfh        da        2
## 6 bfh        es        2
## 7 bfh        lt        2
## 8 bfh        it        1
## 9 bfh        zxx       1
# Pie chart of langugaes
tweets %>%
  filter(university == "bfh") %>%
  count(lang) %>%
  ggplot(aes(x = "", y = n, fill = lang)) +
  geom_bar(width = 1, stat = "identity") +
  coord_polar("y") +
  labs(title = "Language Distribution of Tweets by BFH")

# Emoji Analysis
emoji_count <- tweets %>%
  filter(university == "bfh") %>%
  top_n_emojis(full_text)

print(emoji_count)
## # A tibble: 20 × 4
##    emoji_name                    unicode emoji_category        n
##    <chr>                         <chr>   <chr>             <int>
##  1 backhand_index_pointing_right 👉      People & Body        49
##  2 battery                       🔋      Objects              16
##  3 backhand_index_pointing_down  👇      People & Body        12
##  4 woman                         👩      People & Body        12
##  5 palm_tree                     🌴      Animals & Nature     11
##  6 bulb                          💡      Objects              10
##  7 computer                      💻      Objects              10
##  8 evergreen_tree                🌲      Animals & Nature     10
##  9 graduation_cap                🎓      Objects              10
## 10 party_popper                  🎉      Activities           10
## 11 robot                         🤖      Smileys & Emotion    10
## 12 rocket                        🚀      Travel & Places      10
## 13 clap                          👏      People & Body         9
## 14 coconut                       🥥      Food & Drink          9
## 15 date                          📅      Objects               9
## 16 deciduous_tree                🌳      Animals & Nature      9
## 17 flag_Switzerland              🇨🇭      Flags                 9
## 18 automobile                    🚗      Travel & Places       8
## 19 clinking_glasses              🥂      Food & Drink          8
## 20 seedling                      🌱      Animals & Nature      8
# Sum of all emojis used
print(sum(emoji_count$n))
## [1] 239
emoji_count %>%
  mutate(emoji_name = reorder(emoji_name, n)) %>%
  ggplot(aes(n, emoji_name)) +
  geom_col(fill = "#37556E") +
  labs(
    x = "Count",
    y = "Emoji",
    title = "Top 20 Emojis Used by BFH"
  ) +
  theme_minimal()

heatmap_data_bfh <- tweets %>%
  filter(university == "bfh") %>%
  count(day, timeofday_hour) %>%
  complete(day, timeofday_hour, fill = list(n = 0))

# Show the pattern of the post time by day and hour. You can see clearly the working hours which are the time where the most tweets are posted.
ggplot(heatmap_data_bfh, aes(x = day, y = timeofday_hour, fill = n)) +
  geom_tile(color = "white") +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  labs(title = "Heatmap of Tweet Activity", x = "Day", y = "Hour") +
  theme_minimal()

engagement_hour_bfh <- tweets %>%
  filter(university == "bfh") %>%
  group_by(timeofday_hour) %>%
  summarise(avg_engagement = mean(weighted_engagement, na.rm = TRUE))

# When we look at the engagement by hour, we can see that the most engagement is around 4pm until 8pm. There is also a slight peak in the midday
ggplot(engagement_hour, aes(x = timeofday_hour, y = avg_engagement)) +
  geom_bar(stat = "identity", fill = "#37556E") +
  labs(title = "Average Engagement by Hour", x = "Hour", y = "Avg Engagement") +
  theme_minimal()

plot_data_bfh <- tweets_filtered %>%
  filter(university == "bfh") %>%
  group_by(month) %>%
  summarize(mean_sentiment = mean(sentiment, na.rm = TRUE))

ggplot(plot_data_bfh, aes(x = month, y = mean_sentiment)) +
  geom_line() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(
    title = "Mean Syuzhet Sentiment Over Time by BFH",
    y = "Mean Sentiment Score",
    x = "Month"
  ) +
  theme_minimal()

insights <- list(
  "Most Active Hours" = hours_with_most_tweets_by_uni,
  "Most Active Days" = days_with_most_tweets_by_uni,
  "Content Analysis" = head(words_freqs_de),
  "Sentiment Analysis" = head(tweets_filtered$sentiment)
)

Key Insights

Based on the analysis, the following recommendations and key insights can be made to enhance BFH’s communication strategy:

1. Language and Emoji Usage:

  • Language: BFH predominantly tweets in German, with 3008 tweets in this language.
  • Emojis: There are emojis like “👉” (backhand index pointing right), “🔋” (battery), and “👇” (backhand index pointing down) that are frequently used. However, the overall emoji usage is relatively low.

2. Optimal Posting Times and Days:

  • Most Active Hours: The analysis revealed that 8 AM is the most active hour for BFH tweets. However, tweets posted around 4 PM and 8 PM tend to receive higher engagement.
  • Most Active Days: Tuesday is the most active day for BFH tweets. This resonates already well with the most liked tweets. But the peak on Tuesday is not very significant.

3. Content and Sentiment Analysis:

  • Content Themes: Frequent words in BFH tweets include “Bern”, “Projekt” (project), and “Hochschule” (university). This indicates a focus on academic projects and institutional updates.
  • Sentiment: Positive words often relate to academic achievements and successful projects, while negative words relate to challenges and competitions. A challenge is often subtitued with problem or issue. We suggest to use more positive words in the tweets to get a better sentiment score.

4. Engagement Patterns:

  • Peak Engagement: The highest average engagement occurs around 8 PM, with notable peaks at 4 PM and 6 PM.
  • Consistent Engagement: Engagement remains relatively consistent from 6 AM to 5 PM, with moderate fluctuations.

Recommendations:

  • Twet Schedule: Schedule tweets during the most active hours (4 PM, 6 PM, and 8 PM) to maximize engagement. Consider aligning important announcements and updates with these times and specific days for better visibility. Implement a regular schedule, for example, posting about new projects or student achievements each Tuesday at 4 PM.
  • Multilangual Tweets: The BFH may enhance its multilingual communication strategy by incorporating tweets in English and French to reach a broader audience.
  • Emphasize Positive Themes: Highlight student achievements, successful projects, and positive institutional news to generate positive reactions. Posts like “Congratulations to our students for their successful bachelor thesis” will likely receive positive reactions.
  • Utilize Topic Insights: Emphasize key themes such as academic projects, student updates, and digital initiatives to align content with audience interests.
  • Emojis: Conduct further research on the impact of emojis on tweet popularity, as current analysis does not show a strong correlation.
  • Interactive Content: Use polls, Q&A sessions, and live tweets during events to increase engagement.
  • User-Generated Content: Encourage students and faculty to share their experiences and tag the university in their posts. Retweeting and highlighting these posts can create a sense of community and increase engagement.
  • Demographic Insights: Analyze the demographics of the followers engaging with BFH tweets. Tailor content to resonate with different audience segments. This is not covered in this document.

Integrating Tweet Analysis into Internal Processes:

1. Real-Time Tweet Dashboard:

Implement a dashboard to track tweet performance, including engagement metrics, sentiment scores, and trending topics. This allows adjusting the tweet schedule and audience targeting in real-time. Monthly reports could also be sufficient for strategic adjustments.

2. Content Calendar:

Develop a content calendar that aligns tweet releases with peak engagement times and days. This helps plan tweets in advance and ensures they are posted at optimal times.

3. Feedback Loop:

Establish a feedback loop where the communication team reviews analytics data and adjusts strategies according to the findings. Regular team meetings can help discuss insights and support a data-driven approach to communication.

Potential Data Products:

  1. Engagement Prediction Tool: Develop a tool that predicts the best times to tweet based on historical data. This optimizes tweet scheduling for maximum engagement and can also predict periodic peaks in engagement.
  2. Sentiment Analysis Bot: Develop a tool to analyze the sentiment of drafts before posting, ensuring an appropriate and positive tone. This should aldough not be used as a content controll (We’re not in China)
  3. Trend Tracker: Create a feature that identifies emerging topics and trends in real-time. This allows the communication team to quickly adapt and incorporate relevant themes into their messaging.
  4. Influencer Tool: Identify key influencers who engage with BFH’s tweets. This can include industry leaders, alumni, or current students with significant followings.

By integrating these recommendations and tools, BFH can enhance its communication strategy, ensuring that its messages are timely, relevant, and engaging for its audience.